C# Large Array List

Need help with an engine or coding not on the list? Need help with a game or the website and forums here? Direct all questions here.
Post Reply
User avatar
Xaos
Posts: 946
Joined: Wed Jan 11, 2012 4:01 am

C# Large Array List

Post by Xaos »

Hey guys. I'm doing economic simulations, and I need to have an array list with between 300 million to 500 million values in it. I'm doing this and I always get an OutOfMemory exception here:

Code: Select all

var query = from numbers in medianList orderby numbers ascending select numbers;
On this line, I am trying to find the median. Anyway I can bypass this limitation and get my numbers to where I need them to be?
User avatar
Jackolantern
Posts: 10893
Joined: Wed Jul 01, 2009 11:00 pm

Re: C# Large Array List

Post by Jackolantern »

With that many values, it becomes very dependent on what you are actually storing. If it is a complex type, you may literally be running out of memory. Just like SQL, LINQ has to make in-memory models of your data, so working with hundreds of millions of entries is going to be a problem.

You may need to move this into an actual database. Or you may be able to do it without doubling your data in-memory, since finding the median is not hard programmatically. First do a merge sort on your data (merge sort scales well and does not require too much extra memory). Get a count of how many items you are storing. If it is an odd number, divide the number by 2 and take the ceiling of the result. That is the item that contains the median for an odd number of items. If it is even number, divide the number of items by 2, store the value (call it 'A') of the item pointed at by that index, add one to the index, get the number of the item stored at that index (call it 'B'), add A and B together, and divide the result by 2. That is would be your median in the case of an even number of items.
The indelible lord of tl;dr
User avatar
a_bertrand
Posts: 1537
Joined: Mon Feb 25, 2013 1:46 pm

Re: C# Large Array List

Post by a_bertrand »

Be sure to uncheck the "prefer 32 bit" in the build options as otherwise you will be limited in memory usage.
Creator of Dot World Maker
Mad programmer and annoying composer
User avatar
Xaos
Posts: 946
Joined: Wed Jan 11, 2012 4:01 am

Re: C# Large Array List

Post by Xaos »

Thanks guys. I'm going to do uncheck the "Prefer 32-bit" to see if that works. If not, I'll do the merge sort. If that doesn't work I'll create a database. I'm storing longs in the array, because they are incomes.
User avatar
Jackolantern
Posts: 10893
Joined: Wed Jul 01, 2009 11:00 pm

Re: C# Large Array List

Post by Jackolantern »

Yep, you very well may be running out of memory then when the LINQ query runs.
The indelible lord of tl;dr
User avatar
Xaos
Posts: 946
Joined: Wed Jan 11, 2012 4:01 am

Re: C# Large Array List

Post by Xaos »

Poop. Tried the changing of the 32 bit option, and I got it to run tens of millions. When I do hundreds, it runs out when I'm adding to the arrayList. So now I'm going to have to move a DB
User avatar
Jackolantern
Posts: 10893
Joined: Wed Jul 01, 2009 11:00 pm

Re: C# Large Array List

Post by Jackolantern »

Xaos wrote:Poop. Tried the changing of the 32 bit option, and I got it to run tens of millions. When I do hundreds, it runs out when I'm adding to the arrayList. So now I'm going to have to move a DB
Paste in the full LINQ query where it fails, and I will try to get a read on just how memory efficient we are needing to be here. If you only got into tens of millions and you need hundreds of millions, we may have to go more drastic here than a database.

EDIT: By the way, what exactly are you working with to have so many records? lol
The indelible lord of tl;dr
User avatar
Xaos
Posts: 946
Joined: Wed Jan 11, 2012 4:01 am

Re: C# Large Array List

Post by Xaos »

I'm creating an economics simulator, and I'd like to have a full population of the US (300 million)


With the hundreds of millions, it breaks here:

Code: Select all

citizenList.Add(income);
where I'm adding incomes (300 million of them) to the list.

my list is declared as

Code: Select all

public static List<long> citizenList = new List<long>();
User avatar
Jackolantern
Posts: 10893
Joined: Wed Jul 01, 2009 11:00 pm

Re: C# Large Array List

Post by Jackolantern »

It will slow the system way down, but you may need to break the data up and store it on the disk in files so that you can load small parts at a time. Merge sort will still work for this with some modifications. Basically, you load the data, maybe a million or 500,000 at a time into flat files. Only opening one file at a time, sort the contents of each file. Then you merge 2 files into one, maintaining the sorting. To do this you basically you keep comparing the next values in file A and file B (two files you already sorted individually), and keep adding the lower one into a new file C until you have exhausted all the records in one of the files. Then just add the rest of the non-exhausted file into the new file C. Keep merging them in this way until you have one large file with all of the values sorted. You may need to stream the files so that they don't start blowing up memory when you are combing the last few. At that point you can simply iterate through the large file until you reach the values you need.

This is an old trick from the days when main memory cost $20 a byte and programmers often had to sort files 100x larger than main memory coming from magnetic tape lol. Still quite a useful thing to know how to do.
The indelible lord of tl;dr
User avatar
Xaos
Posts: 946
Joined: Wed Jan 11, 2012 4:01 am

Re: C# Large Array List

Post by Xaos »

So let me see if I got this right:

Write a million or so income #s a file
Create a new file
Write another million
Repeat for Rand(300,500) million
Merge sort all of those files in ascending order
Find the median from those files
?

Also total question/hypothetical, not going to switch, but would this be easier in C++ or C?
Post Reply

Return to “Advanced Help and Support”