It looks like I may have jumped the gun with my conclusions from the previous exercise to benchmark string performance in Delphi 2009. Following a useful exchange in the comments with Kryvich I corrected a small discrepancy in the tests and made some changes to the performance testing subsystem within the SmokeTest framework. I then re-ran my string performance benchmarks with some significant – and more encouraging – differences in the results.
I shall not go into the specifics of the tests again – if you’re interested and missed it the first time around you can read my previous post.
The major changes were made in the SmokeTest performance testing subsystem itself:
1. Corrected a bug in the CPU affinity code (oops!)
2. Implemented CPU instruction cache flushing
The CPU affinity fix now correctly ensures that the code under test is executed on the 2nd processor in any N-core or N-processor hardware. The main thread of the SmokeTest framework application itself is assigned to the 1st CPU.
This is intended to eliminate context switching artefacts and main thread impact from the code under test. The bug was that the main thread was inadvertently allowed to run on all available CPUs – the impact on test results is negligible I think, but even so.
The CPU instruction cache flushing has a much more significant effect and is performed following each execution of a method under test. It has markedly changed the raw numbers coming out of the test results (i.e. the number of executions completed per second) but should ensure that the performance of the code itself is tested, not the efficiency of the CPU cache.
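For anyone curious what this kind of pinning looks like, here is a minimal sketch using the Win32 SetThreadAffinityMask call. It is not the SmokeTest code itself: the TTestThread class and the program name are hypothetical, and it simply assumes a machine with at least two logical CPUs.

```delphi
program AffinitySketch;

{$APPTYPE CONSOLE}

uses
  Windows, Classes, SysUtils;

type
  // Hypothetical worker thread standing in for "the code under test".
  TTestThread = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TTestThread.Execute;
begin
  // Bit 1 of the mask selects the 2nd logical CPU.
  // A result of 0 means the call failed (e.g. a single-CPU machine).
  if SetThreadAffinityMask(GetCurrentThread, 1 shl 1) = 0 then
    Exit;

  // ... the code under test would run here ...
end;

var
  Worker: TTestThread;
begin
  // Bit 0 of the mask keeps the main thread on the 1st logical CPU.
  SetThreadAffinityMask(GetCurrentThread, 1 shl 0);

  Worker := TTestThread.Create(False);
  try
    Worker.WaitFor;
  finally
    Worker.Free;
  end;
end.
```

The masks are bit fields, one bit per logical CPU, so the two threads never share a processor as long as the masks do not overlap.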
The source code being tested is essentially unchanged from that previously used, although it contained one minor correction to the Indexing test case already documented in the comments of that previous post.
I have updated the results download with the new data. This time I have included the Excel spreadsheet with the pretty formatting of the results data comparison and also all three of the raw results files emitted in CSV format by the test app itself.
Observations On The New Results
The new test results actually fit more intuitively with what we might expect, albeit still with one or two noteworthy results.
1. Overall Unicode string performance is about 5-10% less efficient than ANSI string handling in Delphi 2007 but is comparable to, and in many cases better than, Delphi 7.
2. Char-wise operations are the most adversely affected as we might expect but, somewhat surprisingly, simple assignment of strings actually comes out as the most badly affected of the basic operations.
3. ANSI string handling is generally as good as if not slightly better than Delphi 2007 overall, even more so when compared to Delphi 7. Notable exceptions to this remain in the form of the lack of an ANSI implementation for IntToStr() and a significantly slower Replace() implementation.
4. There is still a question mark over the raison d’etre of TStringBuilder, although the difference in performance – whilst still very dramatic – is not perhaps as great as first it appeared.
Revised Conclusion
Concerns w.r.t the performance of ANSI strings were largely misplaced.
There remain a couple of potential gotchas in the form of IntToStr() and Replace() but, as previously noted, the FastReplace() implementation remains the gold standard for anyone concerned enough to use it (after taking care to ANSI-fy the API of FastStrings itself of course).
Overall string handling performance at worst undoes some of the gains made in this area in Delphi 2007, but I think that is a reasonable trade for the Unicode capabilities added as a result.
I have also learned some valuable lessons that have improved the utility of my SmokeTest framework into the bargain, so thanks to all who questioned and probed the previous results.
Nice work!
Unicode is inevitable and what the Delphi team has done is the best long term solution, but I think it still was a bit of a gamble to “force” everyone to use it, for two reasons:
1) Work/cost to upgrade string handling code, including third-party code/components. (As an example, I guess many are waiting to upgrade to Delphi 2009 until DevExpress has updated their grid)
2) Performance hit for those who do not (yet?) need Unicode.
Seems to me that you’ve proven that 2) is essentially a non-issue. 🙂
Good work on your tests! Could you present your data in a chart somehow?
How do you flush the CPU cache?
A Delphi StringBuilder was a big request item from people who wanted to use common code between Win32 and .Net but not have to worry as much about .Net’s string performance.
@Jarle – Indeed, performance seems not to be as big a concern as I for one had feared, although the (potential) memory issue is unavoidable.
@Daniel – Good idea. I hope the chart I knocked up (see updated post) was what you had in mind.
@Bruce – Using FlushInstructionCache.
http://msdn.microsoft.com/en-us/library/ms679350(VS.85).aspx
Although to be honest I couldn’t think of a way to confirm that this works as advertised. 🙂
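For reference, a minimal sketch of how such a call might be placed after each execution of a method under test. MethodUnderTest and the loop count are hypothetical stand-ins, not the SmokeTest implementation. Note that the documented purpose of FlushInstructionCache is to be called after code has been generated or modified in memory, so how much it affects a benchmark like this is an open question.

```delphi
program FlushSketch;

{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

// Hypothetical stand-in for a method under test.
procedure MethodUnderTest;
begin
end;

var
  i: Integer;
begin
  for i := 1 to 1000 do
  begin
    MethodUnderTest;

    // Passing nil and 0 requests a flush without naming an address range.
    if not FlushInstructionCache(GetCurrentProcess, nil, 0) then
      RaiseLastOSError;
  end;
end.
```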
As far as TStringBuilder is concerned, I was pretty sure that was the reason for having it. I think CodeGear should perhaps make this clearer in the documentation.
Thanks for the reference to FlushInstructionCache.
As for their .Net plans, I’m looking forward to CodeGear’s .Net roadmap.
Nice test!!
I wonder why Delphi 2009 still needs its users (programmers) to write their own Unicode-capable versions of functions such as:
PosIEx()
PosBack()
FastReplace()
I wrote them myself, but I am not satisfied with them. Speed is important, you know!
Bear
Mmm, shouldn’t TStringBuilder be used in the following way to achieve actual performance benefits:
1) Create builder instance
2) Set builder Capacity, either as the real predicted size of the future string or just as a big enough amount.
3) Now use convenient Add… methods
4) Get your result string.
Actually this way you’re preallocating memory, preventing reallocations during additions. And this way you’re using nice methods instead of calling SetLength(Result…) and manipulating the Result string directly, or using a memory stream.
Regarding the StringBuilder – AFAIK, the reason it was introduced in .net was not (directly) performance, but rather the unsuitability of the .net memory manager for frequent small changes to strings (i.e., changes in their sizes). Indirectly this, of course, affects performance too.
I don’t know how far this affects Delphi and its memory manager, or whether it could be interpreted as preparation for the introduction of garbage collection in Delphi.
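The four steps above can be sketched roughly as follows. This is only an illustration under a few assumptions: BuildList and its delimiter are hypothetical names, the capacity is a simple estimate from the item lengths, and the builder's methods are called Append rather than Add.

```delphi
program BuilderSketch;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: joins Items into one delimited string.
function BuildList(const Items: array of string): string;
const
  Delimiter: Char = ',';
var
  Builder: TStringBuilder;
  i, Needed: Integer;
begin
  // Step 2 prepared up front: estimate the final size
  // (each item plus one delimiter character).
  Needed := 0;
  for i := Low(Items) to High(Items) do
    Inc(Needed, Length(Items[i]) + 1);

  // Steps 1 and 2: create the builder with the predicted capacity.
  Builder := TStringBuilder.Create(Needed);
  try
    // Step 3: append the pieces.
    for i := Low(Items) to High(Items) do
    begin
      if i > Low(Items) then
        Builder.Append(Delimiter);
      Builder.Append(Items[i]);
    end;

    // Step 4: fetch the result string.
    Result := Builder.ToString;
  finally
    Builder.Free;
  end;
end;

begin
  WriteLn(BuildList(['alpha', 'beta', 'gamma']));
end.
```

Whether this actually beats plain concatenation is something only a benchmark can answer; the sketch shows the usage pattern, not a guaranteed gain.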
Nice!
A side question: is a Unicode exe bigger than a non-Unicode one? If so, by how much?
@Jolyon. OK, now I trust your results. 🙂 Thanks!
@Bruce – Apparently you’re right about TStringBuilder. But I think this class can be significantly improved on the Win32 side, and it can be even more efficient than the String type in many cases.
Yes that chart is great 🙂 Thanks a lot for your efforts!
@Bruce – I also think there are very interesting times ahead on the .NET side of things… very interesting indeed. And in a good way. 🙂
@Kashmi – I once had the same thought and created my own string builder-like class a while ago specifically to handle specialised cases that I thought could be optimised. e.g. building a delimited list from a known set of elements, where the resulting string size could be pre-calculated and pre-set and then the contents placed directly into the resulting string buffer.
Intuitively this should give some improvement in performance.
In reality I found that the expected performance gains were simply not realised, and performance was actually worse than building the string up using regular concatenation.
I didn’t analyse it too closely. I put it down to the fact that using a class to encapsulate this stuff necessarily introduces its own overhead (method calls vs – i.e. on top of – RTL code, as well as the construction and destruction of the class itself etc).
@Daniel (Luyo) – I haven’t compared EXE sizes and I think doing so is a little tricky as it isn’t possible to compare a Delphi 2009 ANSI executable with a Delphi 2009 Unicode one since of course the former is not possible to produce. We could only compare a 2009 exe with a pre-2009 exe and I think there are too many other factors involved between versions for such a comparison to provide any useful measure of the impact in this area of Unicode specifically.
@Kryvich – Thanks. I feel these results have a more “truthy” feel about them too. 🙂
@Jolyon – do you know something about the .NET version that I don’t ?
Naa, and if you did, you couldn’t tell, right ? 🙂
@Per – I can’t really answer either question.
I don’t know what you know about .NET so I can’t know whether I know something you don’t know, cos I don’t know if you don’t know it or not. But if I did know that I knew something that you didn’t know, then self-evidently I must be able to tell that, so being simultaneously in a state where-at I could not tell would create an internal paradox and I would consequently most likely disappear in a puff of logic.
😉
But you only have to look at recent indications from CodeGear themselves that future Delphi.NET releases will not be so concerned with compatibility with Delphi.Win32 and will be more focused on leveraging the latest and greatest .NET technologies.
That in itself is very interesting, no?
Whatever else I may – or may not – know.
🙂