In the next couple of weeks I need to solve a 'large' MS Excel number-crunching problem by upgrading or reconfiguring my current work/gaming rig or transitioning the load to a dedicated workhorse (e.g., a new Xeon-based workstation).
As a starting point/limitation, assume that I am stuck with Excel 2010 64-bit for all time because of its inherent 'transparency' and because of the target audience's general familiarity with it. So, the primary build spec: provide Excel 2010 64-bit with optimal hardware to address my particular application rapidly (if such a thing can exist).
Regarding my application: the Excel application might be thought of as a logic controller that accepts a mix of analog and digital inputs (~300 per row), performs 1250 calculations per row (translations, error correction, etc.), and outputs a single numeric value after processing the complete data set (~9000 rows). For reference, some of the workbook and processing stats are:
static data/workbook: 9000 rows x 300 columns (~50/50 mix logical/decimal)
lookups/workbook: ~105% x static data
calculations/workbook: 9000 rows x 1250 columns (primarily logic based)
workbook size on disk: 250+ MB
average time to open file: 10+ minutes
average time to perform calculations: 20+ minutes
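As a sanity check on whether RAM could plausibly be the constraint, the stats above can be turned into a rough footprint estimate. A minimal Python sketch, assuming a hypothetical ~100 bytes per cell; Excel's real in-memory cost per cell varies widely with data type and formula complexity, so treat this strictly as an order-of-magnitude check:

```python
# Back-of-the-envelope sizing from the workbook stats above. BYTES_PER_CELL
# is an assumption, NOT a measured Excel figure.
STATIC_CELLS = 9_000 * 300      # static data: 9000 rows x 300 columns
CALC_CELLS = 9_000 * 1_250      # calculations: 9000 rows x 1250 columns
BYTES_PER_CELL = 100            # assumed average in-memory cost per cell

total_cells = STATIC_CELLS + CALC_CELLS
est_bytes = total_cells * BYTES_PER_CELL

print(f"total cells: {total_cells:,}")                      # 13,950,000
print(f"estimated footprint: {est_bytes / 2**30:.1f} GiB")  # 1.3 GiB
```

Even this conservative assumption lands in the gigabytes before counting lookups, Excel's own overhead, or multiple open instances, so memory pressure is at least plausible.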
Current gaming/work rig: http://pcpartpicker.com/p/3DCzE
I suspect that the primary bottlenecks are storage/memory related because the cores are rarely maxed during the calculation process. 'More RAM!' was my initial reaction, but that led to questions like: How much more RAM is needed to get a significant improvement in performance? Would a different storage strategy/architecture provide a significant improvement? At what point does the processor become the bottleneck? Would a workstation do it better?
Realizing the potential for conversational tangents and that every Excel problem is different, I think my most important first questions are:
in general, do you suppose Excel is better supported by workstation or PC architecture?
what about database management and numerical computing (e.g., Matlab)?
Then, if workstation architecture is superior:
do you suppose the build listed below will offer significant improvement?
what might you change to better suit Excel while maintaining a $6,000 budget?
how much performance improvement might be expected?
is it possible to get similar performance for a lot less money, say, $4,000 total?
Build notes: (primarily compiled as a series of guesses based on your workstation build)
initial budget < $6k USD
must drive 4 monitors
the build still needs hard drives but I wasn't ready to hazard a guess. You mentioned something about SSD read/write strategies with the Marvell controller during your build that got my interest...
otherwise, I have all peripherals
workstation will be 100% dedicated to work
no overclocking; the build must have data redundancy, integrity, and system stability
Finally, I have been building PCs since '95 so I am generally comfortable with the process.
First, to the question of your current Excel problem: it is difficult to say from your description what the performance bottleneck might be, if any. It would be good to examine system resource use while actually loading or solving/calculating your Excel problem. What is the memory usage? What is the CPU usage? Is Excel scaling to all processors or just a single one? What is the disk usage? Answering these questions should help pinpoint what is actually holding back your performance.
In general, the best PC architecture for a numerical problem depends on the algorithm being employed to solve it. Is it primarily a serial algorithm or a parallel one? If it is serial, then what you really care about is single-threaded performance, whereas if it is parallel, then multi-threaded performance is more important.
This follows into the issue of workstation hardware versus desktop hardware. For example, I work with a large numerical model on a daily basis and I use a desktop i7 processor to solve it. Because the algorithm used to solve it is serial, I benefit from the high single-threaded performance of the i7; using something like a Xeon processor would actually yield decreased performance. A problem that is highly parallel (e.g., a large Monte Carlo problem in Excel) might benefit from the higher core count of a Xeon, though.
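The serial-vs-parallel distinction can be illustrated outside Excel. Below is a minimal Python sketch of an embarrassingly parallel Monte Carlo job, the kind of workload that scales with core count and so would favor a high-core-count Xeon; the function names and sample counts are illustrative only, nothing Excel-specific:

```python
import random
from concurrent.futures import ProcessPoolExecutor

def estimate_pi(samples: int, seed: int) -> float:
    """Monte Carlo estimate of pi: fraction of random points in the unit circle."""
    rng = random.Random(seed)
    inside = sum(1 for _ in range(samples)
                 if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * inside / samples

def parallel_pi(total_samples: int, workers: int) -> float:
    """Split the sample budget across worker processes and average the estimates.
    Each chunk is independent, so wall-clock time shrinks with core count."""
    chunk = total_samples // workers
    with ProcessPoolExecutor(max_workers=workers) as pool:
        estimates = list(pool.map(estimate_pi, [chunk] * workers, range(workers)))
    return sum(estimates) / len(estimates)

if __name__ == "__main__":
    print(parallel_pi(400_000, 4))
```

A serial model gains nothing from extra workers here; only the embarrassingly parallel structure lets the Xeon's core count pay off.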
I don't really know about the performance requirements for database management software; maybe someone else can offer some advice on this.
Just gonna throw this out there, but why not try out LibreOffice and see how it plays with Mantle... that could really accelerate the hell out of the calculations.
Regarding the Excel problem: I believe the following observations suggest that the processors are not overburdened and that RAM capacity is at least one of my problems; specifically,
System memory usage steadily increases (a linear ramp) from ~3GB to ~7GB over the first 2 to 3 minutes of the <Open> process and remains at 7GB for the balance, i.e., 8 to 10 minutes
Total CPU usage is a consistent 10%
Excel is scaling to 6 of the processors with varying load (~20 to 75%)
Each processor exhibits high-frequency start/stop (work/wait) behavior (perhaps +/- 25% peak-to-peak)
After the first 30 seconds or so of the <Open> process, system disk usage is similar to the processor response: peaky work/wait cycles, and total throughput rarely approaches 1 MB/sec
As a test, I took a full-length model, i.e., all 9000 rows, created three smaller models at roughly 25%, 50%, and 75% of the original, and then performed the same Open/Calc observations. I found that:
the 25% model opened 8-times faster and finished calculations about 20-times faster than the 100% model
the 50% model opened 4-times faster and finished calculation about 10-times faster than the 100% model
the 75% model response was nearly identical to the 100% model
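The size-scaling test above can be quantified: if calculation time scaled linearly with row count, the 25% model would be roughly 4x faster, not 20x. A small Python sketch (speedup numbers taken directly from the observations above) comparing measured relative runtimes against a linear expectation:

```python
# Calc-time speedups relative to the full 9000-row model, from the test above.
speedups = {0.25: 20.0, 0.50: 10.0, 0.75: 1.0}

for fraction, speedup in sorted(speedups.items()):
    measured = 1.0 / speedup   # runtime relative to the full model
    linear = fraction          # runtime a purely CPU-bound, linear model predicts
    print(f"{fraction:.0%} model: measured {measured:.2f}x of full runtime, "
          f"linear predicts {linear:.2f}x")
```

The 25% and 50% models run far faster than linear scaling predicts, while the 75% model is already as slow as the full one. That pattern is consistent with crossing a memory threshold somewhere between the 50% and 75% sizes, after which paging rather than computation dominates.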
I think the test results confirm that limited RAM is causing a slowdown but I'd appreciate additional thoughts on the matter.
Also, regarding your i7, what amount of RAM are you using?
Regarding the architecture issue: My workbook is probably best characterized as 20 parallel algorithms, each operating on a unique 60-variable, 9000-record data set. Ideally, more than one instance (or case) of the workbook will be open at any given time. I am guessing that 'more processors / more threads' could be helpful in this case. Even so, I am feeling pretty uncertain about the path forward.
Do you think 40 parallel algorithms warrant a Xeon workstation?
Any ideas about how I might build a stronger case one way or the other?
Yeah, it sounds like the RAM may be holding you back to some extent, though it seems the RAM you have installed is fairly high performance. You might give gigabuster's suggestion on the timings a try, and the increased memory bandwidth on Ivy-E could also help out.
I think part of the problem here may just be the Excel software itself. I looked around and it seems that VLOOKUP and related lookup functions are a major performance issue in Excel, so if there are any ways to optimize your formulas, you might give that a try. Apparently MS recognizes this issue and even offers some performance tips for using lookups: http://msdn.microsoft.com/en-us/library/ff726673%28v=office.14%29.aspx . I don't know if this is within your requirements for compatibility, but you might want to give Excel 2013 a try; it is definitely faster and should help increase your performance. It is also a lot cheaper than upgrading hardware.
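One of the core tricks in that MSDN article is replacing exact-match lookups over unsorted data (a linear scan) with lookups over sorted data (a binary search). The same idea, sketched in Python rather than Excel formulas; the function names here are illustrative analogues, not anything Excel provides:

```python
import bisect

def lookup_linear(keys, values, target):
    """Analogue of exact-match VLOOKUP on unsorted data: O(n) scan per lookup."""
    for k, v in zip(keys, values):
        if k == target:
            return v
    return None

def lookup_sorted(sorted_keys, values, target):
    """Analogue of the sorted-range tip: binary search, O(log n) per lookup."""
    i = bisect.bisect_left(sorted_keys, target)
    if i < len(sorted_keys) and sorted_keys[i] == target:
        return values[i]
    return None

keys = list(range(0, 18_000, 2))   # 9,000 sorted keys, matching the row count above
values = [k * k for k in keys]
assert lookup_linear(keys, values, 500) == lookup_sorted(keys, values, 500)
```

At 9,000 rows x hundreds of lookups per row, dropping from O(n) to O(log n) per lookup is the kind of change that can dwarf any hardware upgrade.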
In my case I'm using 16GB of RAM with an i7-3770, but for my use, loading the model into memory is a small fraction of the total solve time, so it is not as important.
That is a good point with regard to having many instances running concurrently. Even if one instance isn't scaling to all threads, multiple instances will, so you would benefit from the highly threaded Xeon in those cases.
IMO, if you can wait a couple of months, Haswell-E is supposed to be coming out, and it should be a good upgrade with an 8-core i7 Extreme and higher memory performance with DDR4. If you can't wait that long, you might give the software optimization stuff or Excel 2013 a try and just upgrade the memory and CPU on your current build to either an i7-4930 (K or X) or an Ivy Bridge Xeon.
@gigabusterEXE: I tried both the 1866 and 1600 options, but neither was stable at the suggested rates, and stable latencies failed to produce better performance. Worth a try nonetheless. Thanks!
@TonyCo: I am giving 2013 a try and the MSDN link proved to be a real boon! General testing results shown below.