In some comments on Stack Overflow, Jeroen asked me to post my code for reversing bytes. Rather than posting code into that question/answer that wasn’t directly relevant to the question/answer, I decided to quickly throw the code up on here.
The intent with ReverseBytes() is – as the name says – to reverse the byte-order in some byte-order significant value. This sort of operation is typically required when dealing with values in data originating from systems that have a different “endianness” than the Intel x86 architecture.
This most often means a word or long word sized value (and potentially a huge word, though I’ve not yet encountered that need, hence no overload exists for that yet. I should probably add one just in case :) ).
Anyway, here are my current implementations of this routine:
    interface

      function ReverseBytes(const aValue: Word): Word; overload;
      function ReverseBytes(const aValue: LongWord): LongWord; overload;

    implementation

    { - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
      function ReverseBytes(const aValue: Word): Word;
      begin
        result := (((aValue and $ff00) shr 8)
                or ((aValue and $00ff) shl 8));
      end;

    { - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - }
      function ReverseBytes(const aValue: LongWord): LongWord;
      begin
        result := (((aValue and $ff000000) shr 24)
                or ((aValue and $00ff0000) shr 8)
                or ((aValue and $0000ff00) shl 8)
                or ((aValue and $000000ff) shl 24));
      end;
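The “huge word” case mentioned above could probably be handled along the same lines. What follows is only a sketch and not part of the unit: it assumes a compiler with a true unsigned 64-bit `UInt64` type (a later Delphi version, or FPC), and the name `ReverseBytes64` is hypothetical, chosen so that the example stands on its own rather than joining the overload set above. It uses a simple byte-by-byte loop instead of eight masks.

```pascal
program ReverseBytes64Demo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper (not in the original unit): reverses the eight
// bytes of a 64-bit value by peeling bytes off the low end of the input
// and pushing them onto the low end of the result.
function ReverseBytes64(const aValue: UInt64): UInt64;
var
  i: Integer;
  v: UInt64;
begin
  v := aValue;
  Result := 0;
  for i := 0 to 7 do
  begin
    Result := (Result shl 8) or (v and $FF);
    v := v shr 8;
  end;
end;

begin
  // Prints 0807060504030201
  Writeln(IntToHex(Int64(ReverseBytes64($0102030405060708)), 16));
end.
```

As with the Word and LongWord versions, applying the function twice should give back the original value.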
And, by way of sneak preview, here are the unit tests expressed in my Smoketest framework that test these routines:
    procedure TUnitTest_SysUtils.fn_ReverseBytes;
    const
      W_init      = $0102;
      DW_init     = $01020304;
      W_reversed  = $0201;
      DW_reversed = $04030201;
    begin
      Test('ReverseBytes(word)!').Expect(ReverseBytes(W_init)).Equals(W_reversed);
      Test('ReverseBytes(long word)!').Expect(ReverseBytes(DW_init)).Equals(DW_reversed);

      Test('ReverseBytes(ReverseBytes(word))!').Expect(ReverseBytes(ReverseBytes(W_init))).Equals(W_init);
      Test('ReverseBytes(ReverseBytes(long word))!').Expect(ReverseBytes(ReverseBytes(DW_init))).Equals(DW_init);
    end;
The Smoketest code is no use without the Smoketest framework itself, obviously, and is provided here, as I say, only as a sneak preview, as I hope and intend to have Smoketest in a releasable form soon.
A slight wrinkle in those plans was the discovery of the lack of support for dotted unit names in [the current “official” release of] FPC, 2.6.x.
I was considering removing the dots from my unit names to address this, but am increasingly of a mind to leave them in. This will mean releasing for Delphi initially and simply waiting for FPC to catch up in this area (dotted unit name support is in 2.7 already, in preview form at least).
The code in the Stack Overflow question is logically wrong. It should not be calling ReverseBytes. That Port parameter to DNSServiceRegister is expected to be in network byte order. Your code assumed that it is in reverse of host byte order. Clearly you wrote that code assuming a little endian machine. The code will break as soon as you run it on a big endian machine.
The function you are looking for is htons.
I didn’t write the code assuming a little endian machine, I wrote that code knowing it would be running on a little endian machine (Windows). As such, it is functionally correct for that case.
But I shall add a comment to the code as a note to myself to address this should I ever find myself porting the code to some other implementation of Bonjour on a big endian machine. In that eventuality however, this will be the very, very least of my problems given that the whole thing is bound very tightly in its current form to Windows and the Apple Bonjour implementation for Windows.
Sure it’s functionally correct. But expressing explicitly the true logical intent of the code, namely to convert from host to network byte order, has at least the following benefits:
1. You can see immediately why the transformation is made.
2. There’s no need for a comment.
3. The code will function correctly wherever you compile it.
4. You can delete ReverseBytes from your code base.
Host and network byte order is a bug factory and so it pays to be clear and precise and follow a strict and established convention.
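To make the “host to network” intent concrete, here is a rough sketch of what such a function does. It is an illustration only: `HostIsLittleEndian` and `HostToNetworkWord` are hypothetical names invented for this example, and in real code the existing `htons` (declared, for instance, in Delphi’s WinSock unit) would be the thing to call. The sketch detects the host byte order at run time and only swaps when the host is little endian.

```pascal
program HostToNetworkDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

function ReverseBytes(const aValue: Word): Word;
begin
  Result := ((aValue and $ff00) shr 8) or ((aValue and $00ff) shl 8);
end;

// Hypothetical helper: True when the least significant byte of a Word
// is stored first in memory, i.e. the host is little endian.
function HostIsLittleEndian: Boolean;
var
  Probe: Word;
  FirstByte: Byte absolute Probe;
begin
  Probe := $0001;
  Result := (FirstByte = 1);
end;

// Hypothetical stand-in for htons: network byte order is big endian,
// so the bytes are swapped only on a little endian host.
function HostToNetworkWord(const aValue: Word): Word;
begin
  if HostIsLittleEndian then
    Result := ReverseBytes(aValue)
  else
    Result := aValue;
end;

begin
  // Port 8080 is $1F90. On a little endian host this prints 901F,
  // on a big endian host it prints 1F90.
  Writeln(IntToHex(HostToNetworkWord($1F90), 4));
end.
```

On a little endian host the result is exactly what ReverseBytes produces. The difference lies in the name, and in what happens if the code is ever compiled for a big endian host.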
(*) I first came across the need to “reverse bytes” when working with ANDI chromatogram data in NetCDF files. When taking the position that code should be logically clear and not just functionally correct, it’s somewhat difficult to see how it makes sense to advocate the use of a function specifically and explicitly for dealing with “TCP/IP network byte order”, part of the Sockets API no less, when dealing with situations that don’t involve networking in any way, shape or form. :)
It’s no fun trying to help the unwilling. Carry on calling ReverseBytes.
It’s no fun trying to be “helped” by the dogmatic. :)
Whatever arguments for using htons() are deployed, the fundamental problem that you insist on failing to address is that it doesn’t actually fix the problem. The problem that it does address is entirely theoretical and in practice will never arise, whilst at the same time it would merely have masked the real problem without actually fixing it at all, leaving open the possibility that the underlying problem would/might simply re-appear – perhaps in some other form – later on.
To insist therefore that this is “the ‘right’ way” flies in the face of common sense, irrespective of the dogmatic arguments.
Forget about the error in the SO question. I’m getting at something much more fundamental than that.
It boils down to what you think you are doing. I think you are converting from host to network byte order. So I think you need a function that expresses that intent.
But I guess your view is different. In your view of the problem, you think you are reversing bytes. In which case you are using the appropriate function for that view.
Q: What does converting host to network byte order mean when on a little endian machine ?
Hint: The host is – by definition in the specific context – little endian, and network byte order is big endian. No ? :)
I guess you don’t understand my point. Of course htons on a little endian machine reverses the bytes. We clearly have a different view. As I said before, I see this as “convert from host to network”. You see it as “reverse bytes”. We agree to disagree.
I get your point perfectly. What you don’t seem willing or able to grasp is that your obsession with this function appears to have blinded you to the fact that the use or otherwise of htons() was simply not the problem. In advocating its use you are solving the wrong problem, in fact, solving a problem that doesn’t even exist.
The problem was an incorrectly declared class member, and using htons() would have hidden that problem in this one very isolated case, not fixed it.
That is the key to the problem and whatever the arguments for or against htons(), it is a sideshow that simply ignores the real problem.
Curiously, you seem to have nothing to say about solving the real problem and seem intent on dwelling on this non-problem.
Yes, the non-use of htons was not the original problem. I certainly grasp that. I never once said or suggested otherwise.
A politician would be proud of that response. I take it you are still not going to address the fact that using it would not only have not fixed the problem but would have hidden it ? :)
Yes, using htons would have masked your mistake.
Since you are interested in the original problem, I tried to reproduce it. I failed. The only plausible explanation that I can concoct is that the behaviour change did not occur when you switched compilers, but was introduced when you added the `LongWord` overload of `ReverseBytes`.
    program Project1;

    {$APPTYPE CONSOLE}

    uses
      SysUtils;

    function ReverseBytes(const aValue: Word): Word; overload;
    begin
      result := (((aValue and $ff00) shr 8)
              or ((aValue and $00ff) shl 8));
    end;

    function ReverseBytes(const aValue: LongWord): LongWord; overload;
    begin
      result := (((aValue and $ff000000) shr 24)
              or ((aValue and $00ff0000) shr 8)
              or ((aValue and $0000ff00) shl 8)
              or ((aValue and $000000ff) shl 24));
    end;

    procedure WriteWord(w: Word);
    begin
      Writeln(IntToHex(w, 4));
    end;

    procedure Main;
    var
      i: LongWord;
    begin
      i := $0102;
      // outputs $0000 as is, but $0201 if the LongWord overload is removed
      WriteWord(ReverseBytes(i));
    end;

    begin
      Main;
      Readln;
    end.
That might have been true if it were not for the fact that at the time that the problem first occurred the Word/Longword overload of ReverseBytes was present in both D2007 and D2009 environments. The code was the same in both cases, all that changed was the compiler being used. As described in the SO question, at the time I was simply switching from one compiler to the other and seeing the problem. It wasn’t a case of “I used to use D7 and now I’m on D2009 and things are different” (7/2007 isn’t a typo: at the time of the SO question I was using D7 and 2009, but I have since moved up to 2007 for ANSI stuff).
But just to make sure I was not dreaming, I just recreated the exact same set of conditions and the exact same thing occurs (albeit in this case with Delphi 2006 and 2010 rather than 2007/2009):
With Port declared as “Integer”, Delphi 2010 calls the LongWord version of ReverseBytes(), whilst Delphi 2006 calls the Word version. As a result, under Delphi 2010 the word value that ends up being passed in to Bonjour is zero.
Both word and longword versions of ReverseBytes were created at the same time, so it wasn’t the introduction of one or the other that changed anything, it was just changing compilers that highlighted any problem. Ironically it was the fixing of a bug in the compiler that caused (I should say: “revealed”) my problem ! :)
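The effect of each overload on an Integer port value can be shown without relying on any particular compiler’s overload resolution, by casting explicitly at the call site. This is only an illustrative sketch: the variable `Port` and the value $0102 are stand-ins, not the actual Bonjour code.

```pascal
program OverloadPinningDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

function ReverseBytes(const aValue: Word): Word; overload;
begin
  Result := ((aValue and $ff00) shr 8) or ((aValue and $00ff) shl 8);
end;

function ReverseBytes(const aValue: LongWord): LongWord; overload;
begin
  Result := ((aValue and $ff000000) shr 24)
         or ((aValue and $00ff0000) shr 8)
         or ((aValue and $0000ff00) shl 8)
         or ((aValue and $000000ff) shl 24);
end;

var
  Port: Integer;  // stand-in for the incorrectly declared member

begin
  Port := $0102;

  // Casting to Word pins the Word overload: prints 0201
  Writeln(IntToHex(ReverseBytes(Word(Port)), 4));

  // Casting to LongWord pins the LongWord overload. $00000102 becomes
  // $02010000, and truncating that to a Word leaves 0: prints 0000
  Writeln(IntToHex(Word(ReverseBytes(LongWord(Port))), 4));
end.
```

Declaring the value as a Word in the first place removes the ambiguity entirely, so no cast is needed.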
I am interested in the Smoketest framework you use. Can you send details on it to gaddlord@mtgstudio.com?
I particularly like the idea of giving the test case name as a string parameter.
Will it be Open Source?
Yes, it will be open source, and I shall be posting more details about the framework over the next few weeks while I continue the process of whipping it into shape suitable for public consumption. :)
Is the smoketest framework using that Virtual Interface stuff that Nick blogged about?
W
No. The core framework has been created specifically to not use/rely on language features that are only available in later compilers, in order to make it accessible to the widest possible number of users working with the broadest possible code base. Delphi 7 is the base target compiler level.
It’s no good having a fancy, dancy unit testing framework that needs (e.g.) XE3 in order to compile if the code you wish to test is in Delphi 2006. :)
However, the framework has been designed to be extensible so that it may be possible to incorporate your own extensions that make use of such things if you wished.
Available from… ? :)
Just saw your reply above:
Yes, it will be open source, and I shall be posting more details about the framework over the next few weeks while I continue the process of whipping it into shape suitable for public consumption. :)
Oh, Jahhh…. instead of a single inline assembler command – that abomination.
Yes, in Pure Pascal mode it should work like that. But for 99% of cases a single x86 assembler command does it.
Components using indexing/searching (like TDbf.sf.net) or encryption (like spring4d.org) have such routines routinely.
TDBF.sf.net would probably never be ported to win64, so just to have links persistent: http://code.google.com/p/delphi-spring-framework/issues/detail?id=38
Free Pascal has endian swap functions in base. Use them, don’t declare your own.
Would you name them or give links to the docs ? They should probably cover 16/32/64 bits on any FPC-supported platform ?
I wonder if they are optimized compiler magic or just the same Wirth-like code Delticz displayed above.
E.g.
(swap endianness)
http://www.freepascal.org/docs-html/rtl/system/swapendian.html
bigendian to native:
http://www.freepascal.org/docs-html/rtl/system/beton.html
littleendian to native:
http://www.freepascal.org/docs-html/rtl/system/leton.html
Avoid the swap() functions. These are old TP compatible functions that swap the high and low part, but only work for 16-bit values. (a bit middle-endian style)
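For illustration, a small Free Pascal program using the routines linked above might look like the following. This is a sketch that assumes FPC and its System unit; the variable names are arbitrary. Typed variables are used rather than bare constants so that the intended overload of `SwapEndian` is selected.

```pascal
program FpcEndianDemo;

{$MODE OBJFPC}

uses
  SysUtils;

var
  W: Word;
  D: DWord;

begin
  W := $0102;
  D := $01020304;

  // SwapEndian always reverses the bytes, whatever the host byte order.
  Writeln(IntToHex(SwapEndian(W), 4));         // prints 0201
  Writeln(IntToHex(Int64(SwapEndian(D)), 8));  // prints 04030201

  // BEtoN and LEtoN convert big or little endian data to native order,
  // so the result depends on the host: one of the two is a no-op.
  Writeln(IntToHex(BEtoN(W), 4));
  Writeln(IntToHex(LEtoN(W), 4));
end.
```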
XHCG and BSWAP. That’s all you need….
You mean XCHG. If you use ROR it could be more efficient for 16-bit endian conversion.
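For reference, the single-instruction approach being discussed might look roughly like this. It is a sketch only, and it assumes 32-bit x86 Delphi with the default register calling convention, where the first parameter arrives in EAX and an ordinal result is returned in EAX (AX for a Word). The names `ReverseBytes16` and `ReverseBytes32` are hypothetical, and this is not meant for Win64.

```pascal
program BswapDemo;

{$APPTYPE CONSOLE}

uses
  SysUtils;

// Assumption: 32-bit x86 Delphi, register calling convention.
// aValue arrives in EAX and the result is returned in EAX.
function ReverseBytes32(aValue: LongWord): LongWord;
asm
  BSWAP EAX
end;

// aValue arrives in AX; swapping AL and AH reverses the two bytes
// (ROR AX, 8 would be the alternative mentioned above).
function ReverseBytes16(aValue: Word): Word;
asm
  XCHG AL, AH
end;

begin
  Writeln(IntToHex(ReverseBytes32($01020304), 8));  // prints 04030201
  Writeln(IntToHex(ReverseBytes16($0102), 4));      // prints 0201
end.
```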
XE2 Win64 has buggy implementation of XCHG – don’t use it with RAX register !
According to QC that is still so in XE3, and versions before XE2 failed even 16-bit XCHG.
PS. And that was not the worst of the XE2 asm Win64 bugs.
I don’t know how they could make buggy assembler – but they did it.
Making a buggy assembler is pretty simple, just output the wrong opcodes. If the developer overlooks an opcode variant, the generated code will be wrong. One can always use an external assembler easily for those simple calls and link them.
Yes, using htons would have masked your mistake.
I’ve made my point I think.
How does my program behave on D2007? I tested on D6 which is the ANSI Delphi that I have at hand.
Ah, now I see. I’m declaring my variable as LongWord rather than Integer. So my attempt to reproduce the fault was bogus.
I’ve no idea why the compiler would prefer one over the other when passing an Integer. The language is really weak here. Integral value type compat is just a free for all.
In the VM I have here I have 2006, 2010, XE and XE2 installed (XE2 is good as new, never used :) ).
In my VM at home I have a slightly more comprehensive setup with D7, 2007, 2010, XE2 and XE3.
But I’m not at home at the moment. :)