
Thread: Tests Copying, Pasting, API Cliipboard issues. and Rough notes on Advanced API stuff

  1. #11
    Fuhrer, Vierte Reich DocAElstein's Avatar
    Join Date
    Aug 2014
    Posts
    9,468
    Rep Power
    10
    Some notes in support of this post
    https://www.eileenslounge.com/viewto...297326#p297326 https://www.eileenslounge.com/viewto...297329#p297329




    This is post 11 here and post 17876 in the forum Thread 2824
    Page 2 https://www.excelfox.com/forum/showt...PI-stuff/page2
    Post 11 https://www.excelfox.com/forum/showt...ll=1#post17876
    Post 11 https://www.excelfox.com/forum/showt...age2#post17876





    VBA Win32 API Functioning with String arguments.
    VB Strings

    Originally this page 2 was set aside for some research reporting associated with getting an answer for this main forum question.
    https://eileenslounge.com/viewtopic.php?f=30&t=41659 It still is for some (now wider) research reporting associated with getting an answer, because ..
    …. a direction to an old VB Strings article / book chapter, and a hint of the answer, .. the issue is that with a VB string, ByRef passes a pointer to a pointer into the function, but the API, which knows nothing of BSTRs, assumes it is a pointer to an LPSTR (or LPWSTR) …… has led me to think the answer is in a careful examination of the history / story of the VB / VBA String handling and, in a related deeper sense, the handling of characters in a computer's innards.
    I am trying to, and intend finally to, consider the whole chapter. Very approximately, I am making notes here, adding to them, going off on expanding tangents, and perhaps occasionally ignoring the odd bit. I won't manage to go through the whole chapter in detail the first time around, and possibly never completely. So for the time being, the last few posts will be like a buffer, containing in some cases just a straight copy. The purpose of these empty "Buffer" posts is to avoid having an undefined / open reserved variable, which might do my head in via a bursting leak of my memory leading to a head explosion.
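    The "pointer to a pointer" idea in that hint can be looked at directly, as a rough sketch only. VarPtr and StrPtr are built-in (hidden) VBA functions; the Sub name ShowStringPointers is just my made-up name for illustration. The actual addresses printed will differ on every run.

```vba
Option Explicit

' Hypothetical demo name: prints where a VB string variable and its characters live.
Sub ShowStringPointers()
    Dim s As String
    s = "AB"
    ' VarPtr(s): address of the variable itself. For a String this slot holds a pointer (the BSTR).
    Debug.Print "VarPtr(s) = " & VarPtr(s)
    ' StrPtr(s): the value held in that slot, i.e. the address of the first character.
    Debug.Print "StrPtr(s) = " & StrPtr(s)
    ' Len counts characters, LenB counts bytes: VB holds 2 bytes per character internally.
    Debug.Print "Len = " & Len(s) & ", LenB = " & LenB(s)   ' Len = 2, LenB = 4
End Sub
```

    Passing s ByRef hands over the first number (where the pointer is kept), while an API expecting a plain character buffer wants something like the second.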
    Edit Update
    I have been through the whole of chapter 6, and have probably understood as much as I will for a long time. Some of what I was missing was in the earlier chapters, so I finally read everything up to and including chapter 6. There were a few mistakes, which are always painful when trying to learn, as it is easy to be sent badly off course. Nevertheless I could eventually understand about a third of it, which is a massive improvement over a bigger "Bible" recommended to me by many a few years ago.
    So this is the book I would recommend: Win32 API Programming with Visual Basic, by Steven Roman.
    I will mention the other one, as everybody does, but I still don't personally recommend it, and certainly I would recommend that Steven Roman book first.
    This is the book I would not personally recommend - Dan Appleman's Visual Basic Programmer's Guide to the Win32 API.
    Possibly that book is the perfect reference bible once you already know it all. For a beginner or someone like me desperately trying to understand and learn, it does more harm than good, IMHO.
    With the exception of the odd one or two I luckily stumbled over, most prominent "helpers" at places like forums seem to have some phobia, psychological condition or similar that reduces to zero their ability or desire to share real knowledge or recommend anything useful for learning. This is despite their frequent claims that they are there to help and not to provide a quick service, which in the end is what they mostly seem obsessed with doing: getting as many quick short answers out as possible. That just aids the learning algorithms of artificial intelligence and speeds up it getting rid of all of us, them included. Real knowledge is then lost. Just short answers that might be right if you are lucky. In the end, if they are wrong it won't matter either, everyone will follow like zombies anyway.

    Finally, I expect I have, or will have, the better alternative here, one way or another…







    Book Ref
    https://flylib.com/books/en/4.460.1.9/1/ https://eileenslounge.com/viewtopic....322736#p322736
    https://web.archive.org/web/20121217...n/4.460.1.3/1/
    https://resources.oreilly.com/exampl...piCD/Code_DLLs

    C:\Windows\System32
    Last edited by DocAElstein; 02-11-2025 at 01:50 PM.

  2. #12
    Some notes in support of this post
    https://www.eileenslounge.com/viewto...297326#p297326 https://www.eileenslounge.com/viewto...297329#p297329
    https://eileenslounge.com/viewtopic....322955#p322955

    This is Post #12 https://www.excelfox.com/forum/showt...ll=1#post17877
    https://www.excelfox.com/forum/showt...age2#post17877




    VB Strings, underlying character grouping and "(en)coding" things
    Strings are made up of characters, and the whole thing is stored on a computer that just has 0s and 1s available. At the outset the issues and problems encountered were identified as likely (and with hindsight I strongly suggest they are) related to the different ways computers store and pass the characters, or the way they "encode" the character string, in technical jargon.

    ASCII, "Unicode", ANSI ( Computer Character number History )
    Before talking about VBA Windows API VB Strings, and the underlying associated VB Strings, it could be useful to revise these common computer technical terms and get at least a small understanding. A minor bit of the history is perhaps also helpful, since developments with time have led to some confusion / blurring of facts, leading to some difference of opinion amongst the experts.
    This is just intended to get far enough to make sense of the VB String issues, as understanding the different computer character holding arrangements plays a major role in getting the issues solved.

    Part 1 ASCII (and ANSI)
    We are really mainly about ASCII encoding here in this post, but as we will learn, ASCII often tends to merge or blur into ANSI. Often when comparisons are made in explaining computer string character encoding, especially in things Microsoft, ANSI is compared with Unicode, and the things talked about as ANSI in those discussions are, in those situations, similar to ASCII things.
    ASCII is for historical reasons a good place to start. It is reasonably well defined.

    ASCII - getting started with computers
    ASCII numbers/ ASCII encoding
    In everyday speaking the word ASCII itself tends to be related to some numbers, very, very approximately around 200 of them in total (strictly 128 for ASCII proper, 256 once the common extensions are counted). Let's investigate where these ASCII "numbers" come from.
    This often gets grouped with, or confused with, ANSI encoding, and in general Layman thinking that is not so far off.
    ASCII is short for American Standard Code (numbers really) for Information Interchange. It first came out about 1963, perhaps the first time major thought was given to making some standards for computer things.
    Binary thinking and basic binary encoding ideas
    Deep inside computers, where things get done in a mathematical sort of a way, characters do not get recognised as we see them as humans. Down there in the computer innards and entrails, the workings play with numbers: things like memory addresses/locations and any character itself will have a number to identify it.
    So let’s start with the 200 or so numbers used in the ASCII convention/Standard to identify characters, and order them like, for example in a list with the number alongside the typical English capitals, starting at 65. The list would look something like this : ( Example of some Number v Character lists Share ‘WunucodeANSI.xlsm’ https://app.box.com/s/20erozqcjs2ljphkiycvbtah08y85fy9 )
    A identified by 65 ( or 41 hexadecimal )
    B identified by 66 ( or 42 hexadecimal )
    etc. etc.
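    As a small sketch, that list can be reproduced with the built-in Asc() and Hex$() functions. The Sub name ShowCharacterNumbers is just a made-up name for illustration.

```vba
Option Explicit

' Hypothetical demo name: prints the identifying number of a few capitals in decimal and hexadecimal.
Sub ShowCharacterNumbers()
    Dim letters As String, i As Long, ch As String
    letters = "ABC"
    For i = 1 To Len(letters)
        ch = Mid$(letters, i, 1)   ' take one character at a time
        Debug.Print ch & " identified by " & Asc(ch) & " ( or " & Hex$(Asc(ch)) & " hexadecimal )"
    Next i
End Sub
```

    The first line in the Immediate window should read: A identified by 65 ( or 41 hexadecimal )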
    But computers only have 0s and 1s, not 6s and 5s etc…. so…..
    ….Binary
    Thinking about strings and computer Maths (words, numbers) at the most fundamental school Maths level, I suppose most people have at least some idea what Binary is, or if not, they at least intuitively understand the idea. If you had a number of digits, or simple electronic switches, say 7 in total, and each could only be in 2 states, on or off (which in computer maths we might call 1 or 0), then it does not need an Einstein to figure out that we get a 7 digit coding idea, based on a sequence of 0s and 1s, to represent numbers in the range 0-127, as shown graphically below
    Code:
    '  2^6  2^5  2^4   2^3  2^2  2^1  2^0
    '   64   32   16    8    4    2    1  
    
    '   0    0    0     0    0    0    0           Binary  way to show 0
    '   0 +  0 +  0 +   0  + 0 +  0 +  0  =   0    Calculating the  Decimal 0 from the Binary code
    
    '   0    0    1     1    1    1    1           Binary  way to show 31
    '   0  + 0  + 16 +  8 +  4 +  2 +  1  =   31   Calculating the Decimal 31 from the Binary code
    
    '   1    1    1     1    1    1    1           Binary way to show  127
    '   64 + 32 + 16 +  8 +  4 +  2 +  1  =  127   Calculating the Decimal 127 from the Binary code
    In discussions of this form, these fundamental 7 binary bits are typically called the bottom 7 Bits, where the term Bits became the technical term generally for these and other single binary characters in any other binary representation. In other words a Bit can be described as a thing that can have two states, on/off or 0/1 etc., or a Bit can be described as a Binary digit.
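    The same sums can be done in coding, as a minimal sketch. ToBinary is a made-up helper name, not a built-in VBA function; it just tests each power of 2 from the top Bit down.

```vba
Option Explicit

' Hypothetical helper: returns n as a string of 0s and 1s, bitCount digits wide.
Function ToBinary(ByVal n As Long, ByVal bitCount As Long) As String
    Dim i As Long, result As String
    For i = bitCount - 1 To 0 Step -1          ' highest power of 2 first
        If (n And CLng(2 ^ i)) <> 0 Then       ' is this Bit switched on?
            result = result & "1"
        Else
            result = result & "0"
        End If
    Next i
    ToBinary = result
End Function
```

    For example ToBinary(31, 7) gives 0011111 and ToBinary(127, 7) gives 1111111, matching the sketches above, and ToBinary(65, 7) gives 1000001 for the character A.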
    8 Bits, = a Byte, ..er we are only using 7, initially
    For a few reasons, Historical and technical mathematical, it comes about that grouping things into 8 Bits is convenient.
    We call 8 Bits grouped together a Byte. What happened/happens to the spare digit is perhaps a bit too involved for this simple Layman explanation, suffice to say we end up describing ASCII as Single Byte encoding using the bottom 7 digits.
    We won’t discuss this yet too much, but to get one possible pictorial idea for the first time of a Byte, in a similar way to the last sketches, we can imagine this
    A Byte ( = 8 Bits = 0/1 , 0/1 , 0/1 , 0/1 , 0/1 , 0/1 , 0/1 , 0/1 )
    Code:
    '  2^7 2^6  2^5  2^4   2^3  2^2  2^1  2^0
    '  128  64   32   16    8    4    2    1  
    '   1   1    1    1     1    1    1    1           Binary way to show  255
    '  128+ 64 + 32 + 16 +  8 +  4 +  2 +  1  =  255   We can represent up to the Decimal 255 using this Binary 8 Bit representation, which is often defined as a Byte
    A computer needs a number for all characters in text
    Computer languages are just a lot of 0s and 1s. They don’t understand the language we speak, but with around 100 - 200 or so computer numbers we can give an identifying number for all common text characters, and that is a start. For all common text characters I am talking about what a Layman might regard as typical letters , numbers and other typical symbols, commas, points, maths symbols etc. (Some people might perceive an empty space between words, and similar positioning things like a Tab offset thing, as a character, of sorts. More to those things later)
    For example, a character we see, A, has in ASCII convention/ Standard the identifying number 65. Here is an abstract from the uploaded file.
    ( It is much clearer to look at the worksheet in the uploaded file, especially as we get a lot more characters to show as we go further in the explanations, and a separate window with all of them is easier to reference when reading all this ( https://i.postimg.cc/gcKJcmY0/Table-around-127.jpg __ https://i.postimg.cc/RhGVgQPt/Table-around-255.jpg ) )
    _____ Workbook: WunucodeANSI.xlsm https://app.box.com/s/20erozqcjs2ljphkiycvbtah08y85fy9
    Row\Col	C	D	E	F	G	H	T	U	V
    2	Anumber	Chr(Anumber)	ASCII	ANSI	** MS list 0-127	** MS list 128-255	Wunic	ChrW(Wunucs)	Wiki
    68	65	A	A	A Latin capital letter A	A		65	A	A Latin Capital letter A
    69	66	B	B	B Latin capital letter B	B		66	B	B Latin Capital letter B
    Worksheet: Tabelle1

    Characters other than simple letters and numbers
    Things / like , : + $ etc etc., are perhaps easy for us to conceive as characters also.
    But there are a few others, sometimes called "invisible" characters, as they may be invisible to us, but an ignorant computer, which is itself not much more than strings of text, needs some way to identify those things as well in any final string it has.
    These are often called control codes and are the first 32 ( numbers 0 to 31 ) in Ascii
    Depending on your personal perception of things, a space, or a new line in text, may or may not be regarded as a text character. A computer language is just strings of text, so it conveniently conceives those "invisible" things as text characters. Conventionally a simple space is given the identifying number 32. As for a new line: in the early days of computing, a printer was often used for output, and a printer needed something in the text to cause it to move the thing making the text back to the left ( carriage return ), and it also needed something to cause it to feed in / notch up or down for a new line ( line feed ). These have the "Ascii numbers" 13 ( carriage return ) and 10 ( line feed ). As computers have advanced, these two characters have been kept for new lines in text. Usually both are used, occasionally just one of them is used instead. In VBA coding this new line identifying pair of characters will usually be written as
    vbCr & vbLf
    or
    Chr(13) & Chr(10)
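    A quick check of those numbers, as a sketch. vbCr, vbLf and vbCrLf are built-in VBA constants; the Sub name ShowNewLineCharacters is just a made-up name.

```vba
Option Explicit

' Hypothetical demo name: confirms which number belongs to which "invisible" new line character.
Sub ShowNewLineCharacters()
    Debug.Print Asc(vbCr)                            ' 13, carriage return
    Debug.Print Asc(vbLf)                            ' 10, line feed
    Debug.Print (vbCrLf = vbCr & vbLf)               ' True, the usual pair, carriage return first
    Debug.Print (vbCrLf = Chr(13) & Chr(10))         ' True, the same pair written with Chr()
    Debug.Print Len("Line1" & vbCrLf & "Line2")      ' 12, the new line counts as 2 characters
End Sub
```

    Note the order: carriage return ( 13 ) comes first, then line feed ( 10 ).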
    Attempting to show such characters here will depend on how everything involved reacts to them, as they may or may not cause a line feed of some sort. In the forum table below they are invisible. They are also invisible in the spreadsheet, although there the line feed character seems to show as if two lines are present in the cell. In the text list in the next post they cause a bit of a mess, breaking the line into two lines at those points
    _____ Workbook: WunucodeANSI.xlsm https://app.box.com/s/20erozqcjs2ljphkiycvbtah08y85fy9
    Row\Col	C	D	E	F	G	H	T	U	V
    2	Anumber	Chr(Anumber)	ASCII	ANSI	** MS list 0-127	** MS list 128-255	Wunic	ChrW(Wunucs)	Wiki
    12	9		HT Horizontal tab		* *		9		
    13	10		LF Line feed		* *		10		
    14	11		VT Vertical tab				11		
    15	12		FF Form feed				12		
    16	13		CR Carriage return		* *		13		
    17	14		SO Shift out				14		
    Worksheet: Tabelle1
    https://i.postimg.cc/Y9K90SGv/Cr-and...xcel-Table.jpg


    https://i.postimg.cc/qRs73dyj/Cr-and...Text-Table.jpg


    Continued in next post……..
    Last edited by DocAElstein; 12-22-2024 at 11:27 PM.

  3. #13
    ……… continued from last post

    The "first 256 ( 0, 1, 2 …… 255 )"
    Ascii and perpetrating ANSI historical reference misnomers

    7 (8) Bit Byte Ascii

    So these ASCII numbers and character lists got organised in the ways discussed above, and we end up describing ASCII as Single Byte (which is 8 Bits) encoding using the bottom 7 digits. That leaves the first digit Bit free for something else: historically it was often used as a parity bit, a simple check against transmission errors.
    Having the eight digits, and so the possibility to go up to numbers of 128+127=255, can blur the issue a bit, and we may sometimes be talking of things like "extended ASCII" going up to at least 255. (Of course with 8 digits we can go from [0] to [128+127=255] ). But officially, Ascii is an internationally defined standard for the first 128 numbers only ( 0 to 127 ). Any extensions are not ASCII**.
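    To make the 0 to 127 boundary concrete, here is a rough sketch of a check for whether a string stays inside official Ascii. IsPlainAscii is a made-up helper name, not a built-in function. AscW() is the built-in that returns the character number VB holds internally; the And &HFFFF& is there because AscW returns a signed Integer.

```vba
Option Explicit

' Hypothetical helper: True if every character number in txt is in the official Ascii range 0 to 127.
Function IsPlainAscii(ByVal txt As String) As Boolean
    Dim i As Long, code As Long
    For i = 1 To Len(txt)
        code = AscW(Mid$(txt, i, 1)) And &HFFFF&   ' force the number into the range 0 to 65535
        If code > 127 Then Exit Function           ' leaves the result at its default, False
    Next i
    IsPlainAscii = True
End Function
```

    So IsPlainAscii("Hello") is True, while IsPlainAscii("Caf" & ChrW(233)) is False, since 233 ( é ) sits in the "extended" region.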
    At about 1981, for example, IBM got seriously into a PC attempt and introduced a "Code page 437 (IBM PC)". Like most (not all) such things, it takes the standard Ascii up to and including 127; then up to 255 it had what was an attempt at the best compromise for character encoding standardisation which suited at the time. There is no formal definition of "extended ASCII"**, and the term is often mistakenly interpreted to mean that the American National Standards Institute (ANSI) extended the ASCII standard beyond 128 characters.
    ANSI as applied to an 8-bit character encoding that includes the ASCII characters is not a thing, far less a standard. Microsoft are at least partially to blame for this common error (Microsoft's 8-bit character encoding for the Latin character set is actually called Windows-1252, or cp1252, since we have now moved into the world of Code Pages, which doesn't really trip off the tongue as easily as ANSI). As they themselves say "The term ANSI as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community."

    So when talking about ASCII, we blur into ANSI, which started somewhat later and tends to also have the Byte (so 8 Bits) as its "unit", but generally does use all 256 available numbers ( 0 to 255 ) in a Byte if it is used for basic text, mainly to allow extending into some of the main non English characters.
    You should expect to get a bit confused with Bits and Bytes. Don’t worry too much. Hopefully re reading the first few posts in this page will help get it clear.

    The attempt at a large text table in the next post is an extract from the uploaded file, only a very small part of it. Talking ASCII or Asc things in everyday usage, or talking about Ascii or Ansi things in VB / VBA, usually means talking about things in that first small part, in one way or another.
    For example, talking VB / VBA things, there is a VB / VBA function to get you a typical English text character ( letter, number, or most other typical everyday text characters, including the invisible ones, and a few non English things )
    , Chr()
    , often written in coding something like
    Chr(Asc) or Chr(Anumber)
    , where Asc or Anumber would be an integer number from 0 to 255.
    Note that Asc may not be such a sensible choice for a variable name in actual coding, since we have a function, Asc(" "), which, as you may guess, does the reverse, getting the Ascii / Ansi number from a single character you put between the " ", like pseudo,
    Asc("A") = 65
    Code:
    Sub ANSIandUnicodeList()  '    , Share ‘WunucodeANSI.xlsm’ https://app.box.com/s/20erozqcjs2ljphkiycvbtah08y85fy9      https://www.excelfox.com/forum/showthread.php/2824-Tests-Copying-Pasting-API-Cliipboard-issues-and-Rough-notes-on-Advanced-API-stuff?p=17877&viewfull=1#post17877
    Rem 1 ASCII ANSI List
    Dim Anumber As Long
        For Anumber = 0 To 255   ' Typical range considered for ASCII or ANSI  Sometimes ASCII is regarded as just 0 - 127
         Let Range("C" & Anumber + 3 & "") = Anumber      ' ASCII / ANSI character "number"
         Let Range("D" & Anumber + 3 & "") = Chr(Anumber)
        Next Anumber
    
    Rem 2 Unicode List
    Dim Wunucs As Long
        For Wunucs = 0 To 65535   ' For Unicode the range is much bigger, 144697 currently, 1111998 possible  , but  65535  seems to be the limit for the ChrW() function
         Let Range("T" & Wunucs + 3 & "") = Wunucs        ' Unicode character "number"
         Let Range("U" & Wunucs + 3 & "") = ChrW(Wunucs)
        Next Wunucs
    End Sub
    Rem 1 in that simple coding would get most of the first things in that table abstract in the next post.
    The first column of characters was got using the Chr() function as in that coding above ( The Chr() function will error for numbers greater than 255 )
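    The difference in range between Chr() and ChrW() can be tried out with a small sketch like this. The Sub name ChrVersusChrW is a made-up name. The error number in the comment is what I would expect on a typical single byte system; on a double byte (DBCS) system Chr() accepts a bigger range, so treat that part as an assumption about your machine.

```vba
Option Explicit

' Hypothetical demo name: compares the number ranges accepted by Chr() and ChrW().
Sub ChrVersusChrW()
    Dim ch As String
    On Error Resume Next
    ch = Chr(256)                                   ' outside 0 to 255
    Debug.Print "Chr(256) error number: " & Err.Number   ' typically 5, Invalid procedure call or argument
    Err.Clear
    On Error GoTo 0
    ch = ChrW(256)                                  ' fine, ChrW() takes the bigger Unicode numbers
    Debug.Print "AscW(ChrW(256)) = " & AscW(ch)     ' 256
    Debug.Print "Chr(65) = " & Chr(65) & ", Asc(""A"") = " & Asc("A")   ' A and 65, the round trip
End Sub
```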
    The second two columns of characters are list examples copied from the internet after searching for ASCII and ANSI lists


    It is clearer to look at the worksheet in the uploaded file
    , Share ‘WunucodeANSI.xlsm’ https://app.box.com/s/20erozqcjs2ljphkiycvbtah08y85fy9

    https://i.postimg.cc/cL37bvSH/Lists-...preadsheet.jpg Lists in Excel spreadsheet.jpg





    ANSI (ANSI (or Ascii) historical reference misnomer Perpetration)
    (Some of this may be repeated when we "move up" to Unicode, as the subject of Unicode is often introduced as a comparison (not technically completely accurate) in writings titled ANSI v Unicode, with the word Unicode also having its degree of false use)
    These historical reference misnomers are a nice human tradition that should be continued, IMO, as it can help fool Chat GPT learning algorithms.
    The term ANSI can often be used incorrectly, as a historical misnomer, when discussing the "first 256"

    ANSI is the American National Standards Institute, which has existed in one form or another since 1918 (it took the name ANSI in 1969), so it was originally concerned with more everyday stuff. Its predecessor body published ASCII in 1963, but the work on 8 Bit character sets discussed here belongs to the 1980's.
    Some historical reports and opinions suggest it was originally intended not to specifically define the lists, but rather to discuss and set rules for controlling different Lists, in the range 0 to 255, whereby mostly, ( but not always ) the first 128 are the same
    As mentioned when IBM got seriously into a PC attempt, they introduced a "Code page 437 (IBM PC)" "extended ASCII" going up to at least 255. It seems that possibly lots of people had their own ideas of what should go where in the space from 128 to 255.
    ANSI, in computing was/is a second American standards idea to the Ascii. Developments at that time may have overwhelmed them.
    If we are talking Microsoft things, we might say they hijacked ANSI a bit, before ANSI had completed their initial discussions, Microsoft having a code page for a system to define something similar to ASCII extended, but not always exactly the same*. It was / is used, but mostly gave/ gives way to Unicode. Because of this strange history, around the mid 1980’s, Microsoft talked about ANSI as the new improved thing compared to ASCII - It became a general term, a historical reference misnomer, for the default code page of a given operating system, such as Windows*.
    If we are talking Microsoft VB things, we might say VB only understands ANSI characters
    From the code page you can go on to find the actual characters




    Small differences leading to corruptions
    It was like this: the ANSI institution people were in the process of defining a standard for how to display character data on a computer in the 1980's. Their start point was the 8 Bit, single Byte, 0-255 numbers, and they thought the first 128 ( 0 to 127 ) were best left at the already known and used Ascii Standard. They also recognised that the second 128 characters would not be enough for everyone, so they introduced the idea they called a code page, to define the upper 128 character numbers, or code points, to be a bit more technically explicit. So the idea would be a code page that made the best compromise, in their opinion or in bribed-by-lobbying opinion, for the best possible world wide exchange of text information. But they were not finished when commercial pressures forced Microsoft to hijack the idea and take it further a bit more quickly. Microsoft ended up with many code pages, and other people developed their own code pages as well.
    Finally the ANSI draft became the ISO 8859-1 standard. Microsoft had at some point a code page very close to this, Windows code page 1252. There can be other slight deviations from the actual first 256 in use anywhere. This can lead to problems such as we had at excelfox.com a few years ago
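    One way to see which "first 256" your own computer is actually using is to ask VBA what Unicode number it holds for each Chr() in the region 128 to 159, which is where Windows-1252 and ISO 8859-1 differ. This is only a sketch, and the Sub name ListUpperCodePageMappings is made up. The results depend on the default code page of the machine, so I give no expected output.

```vba
Option Explicit

' Hypothetical demo name: lists the Unicode number behind Chr(128) to Chr(159) on this machine.
Sub ListUpperCodePageMappings()
    Dim n As Long, uniNumber As Long
    For n = 128 To 159
        ' Chr(n) goes through the system default code page; AscW() shows the number VB then holds.
        uniNumber = AscW(Chr(n)) And &HFFFF&
        Debug.Print n, "U+" & Right$("0000" & Hex$(uniNumber), 4)
    Next n
End Sub
```

    On a system using Windows-1252, for example, 128 should come out as U+20AC ( the euro sign ), whereas in ISO 8859-1 the numbers 128 to 159 are control codes.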
    https://www.excelfox.com/forum/showt...ll=1#post15250
    https://www.excelfox.com/forum/showt...1-*/#post15236
    https://www.excelforum.com/developme...ssue-test.html
    https://www.excelforum.com/developme...haracters.html
    https://www.excelforum.com/developme...x-decimal.html

    Last edited by DocAElstein; 12-26-2024 at 10:27 PM.

  4. #14
    Text output of the first 256 ( 0 to 255 ) in typical lists.
    These are generally regarded as the ASCII or ANSI things, which cover most everyday English character things.
    This just gives a general idea, since the columns are a bit messed up and things are not always showing correctly in the following attempt. It is best to look at the worksheet in the uploaded file, as an Excel spreadsheet generally shows a very large number of these things correctly. A text file is not too bad, but the positioning of the columns often gets messed up
    Code:
    Anumber	Chr(Anumber)	ASCII	ANSI	** MS list 0-127	** MS list 128-255
    0		NUL        Null character	 		
    1		SOH        Start of header	 		
    2		STX        Start of text	 		
    3		ETX        End of text	 		
    4		EOT        End of transmission	 		
    5		ENQ        Enquiry	 		
    6		ACK        Acknowledgment	 		
    7		BEL        Bell, alert	 		
    8		BS        Backspace	 	* *	
    9	"	"	HT        Horizontal tab	 	* *	
    10	"
    "	LF        Line feed	 	* *	
    11		VT        Vertical tab	 		
    12		FF        Form feed	 		
    13	"
    "	CR        Carriage return	 	* *	
    14		SO        Shift out	 		
    15		SI        Shift in	 		
    16		DLE        Data link escape	 		
    17		DC1        Device Control 1 (XON)	 		
    18		DC2        Device Control 2	 		
    19		DC3        Device Control 3 (XOFF)	 		
    20		DC4        Device Control 4	 		
    21		NAK        Negative acknowledgment	 		
    22		SYN        Synchronous idle	 		
    23		ETB        End of transmission block	 		
    24		CAN        Cancel	 		
    25		EM        End of medium	 		
    26		SUB        Substitute	 		
    27		ESC        Escape	 		
    28		FS        File separator	 		
    29		GS        Group separator	 		
    30		RS        Record separator	 		
    31		US        Unit separator	 		
    32	 	SP        Space	           space	[space]	
    33	!	!        exclamation mark	!        exclamation mark	!	
    34	"	"        double quote	"        quotation mark	"	
    35	#	#        number sign	#        number sign	#	
    36	$	$        dollar sign	$        dollar sign	$	
    37	%	%        percent sign	%        percent sign	%	
    38	&	&        ampersand	&        ampersand	&	
    39		'        single quote	'        apostrophe	'	
    40	(	(        left/opening parenthesis	(        left parenthesis	(	
    41	)	)        right/closing parenthesis	)        right parenthesis	)	
    42	*	*        asterisk	*        asterisk	*	
    43	+	+        plus sign	+        plus sign	+	
    44	,	,        comma	,        comma	,	
    45	-	-        minus or hyphen	-        hyphen-minus	-	
    46	.	.        Period, dot	.        full stop	.	
    47	/	/        forward slash	/        solidus	/	
    48	0	0        	0        digit zero	0	
    49	1	1        	1        digit one	1	
    50	2	2        	2        digit two	2	
    51	3	3        	3        digit three	3	
    52	4	4        	4        digit four	4	
    53	5	5        	5        digit five	5	
    54	6	6        	6        digit six	6	
    55	7	7        	7        digit seven	7	
    56	8	8        	8        digit eight	8	
    57	9	9        	9        digit nine	9	
    58	:	:        colon	:        colon	:	
    59	;	;        semi-colon	;        semicolon	;	
    60	<	<        less than	<        less-than sign	<	
    61	=	=        equals	=        equals sign	=	
    62	>	>        greater than	>        greater-than sign	>	
    63	?	?        question mark	?        question mark	?	
    64	@	@        At sign	@        commercial at	@	
    65	A	A        	A        Latin capital letter A	A	
    66	B	B        	B        Latin capital letter B	B	
    67	C	C        	C        Latin capital letter C	C	
    68	D	D        	D        Latin capital letter D	D	
    69	E	E        	E        Latin capital letter E	E	
    70	F	F        	F        Latin capital letter F	F	
    71	G	G        	G        Latin capital letter G	G	
    72	H	H        	H        Latin capital letter H	H	
    73	I	I        	I        Latin capital letter I	I	
    74	J	J        	J        Latin capital letter J	J	
    75	K	K        	K        Latin capital letter K	K	
    76	L	L        	L        Latin capital letter L	L	
    77	M	M        	M        Latin capital letter M	M	
    78	N	N        	N        Latin capital letter N	N	
    79	O	O        	O        Latin capital letter O	O	
    80	P	P        	P        Latin capital letter P	P	
    81	Q	Q        	Q        Latin capital letter Q	Q	
    82	R	R        	R        Latin capital letter R	R	
    83	S	S        	S        Latin capital letter S	S	
    84	T	T        	T        Latin capital letter T	T	
    85	U	U        	U        Latin capital letter U	U	
    86	V	V        	V        Latin capital letter V	V	
    87	W	W        	W        Latin capital letter W	W	
    88	X	X        	X        Latin capital letter X	X	
    89	Y	Y        	Y        Latin capital letter Y	Y	
    90	Z	Z        	Z        Latin capital letter Z	Z	
    91	[	[        left/opening square bracket	[        left square bracket	[	
    92	\	\        back slash	\        reverse solidus	\	
    93	]	]        right/closing square bracket	]        right square bracket	]	
    94	^	^        caret/cirumflex	^        circumflex accent	^	
    95	_	_        underscore	_        low line	_	
    96	`	`        	`        grave accent	`	
    97	a	a        	a        Latin small letter a	a	
    98	b	b        	b        Latin small letter b	b	
    99	c	c        	c        Latin small letter c	c	
    100	d	d        	d        Latin small letter d	d	
    101	e	e        	e        Latin small letter e	e	
    102	f	f        	f        Latin small letter f	f	
    103	g	g        	g        Latin small letter g	g	
    104	h	h        	h        Latin small letter h	h	
    105	i	i        	i        Latin small letter i	i	
    106	j	j        	j        Latin small letter j	j	
    107	k	k        	k        Latin small letter k	k	
    108	l	l        	l        Latin small letter l	l	
    109	m	m        	m        Latin small letter m	m	
    110	n	n        	n        Latin small letter n	n	
    111	o	o        	o        Latin small letter o	o	
    112	p	p        	p        Latin small letter p	p	
    113	q	q        	q        Latin small letter q	q	
    114	r	r        	r        Latin small letter r	r	
    115	s	s        	s        Latin small letter s	s	
    116	t	t        	t        Latin small letter t	t	
    117	u	u        	u        Latin small letter u	u	
    118	v	v        	v        Latin small letter v	v	
    119	w	w        	w        Latin small letter w	w	
    120	x	x        	x        Latin small letter x	x	
    121	y	y        	y        Latin small letter y	y	
    122	z	z        	z        Latin small letter z	z	
    123	{	{        left/opening curly brace	{        left curly bracket	{	
    124	|	|        vertical bar	|        vertical line	|	
    125	}	}        right/closing curly brace	}        right curly bracket	}	
    126	~	~        equivalency sign, tilde	~        tilde	~	
    127		DEL        delete	        (not used)		
    128	€	        	€        euro sign	 	€
    129		        	        (not used)	 	
    130	‚	        	‚        single low-9 quotation mark	 	‚
    131	ƒ	        	ƒ        Latin small letter f with hook	 	ƒ
    132	„	        	„        double low-9 quotation mark	 	„
    133	…	        	…        horizontal ellipsis	 	…
    134	†	        	†        dagger	 	†
    135	‡	        	‡        double dagger	 	‡
    136	ˆ	        	ˆ        modifier letter circumflex accent	 	ˆ
    137	‰	        	‰        per mille sign	 	‰
    138	Š	        	Š        Latin capital letter S with caron	 	Š
    139	‹	        	‹        single left-pointing angle quotation mark	 	‹
    140	Ś	        	Œ        Latin capital ligature OE	 	Œ
    141	Ť	        	        (not used)	 	
    142	Ž	        	Ž        Latin capital letter Z with caron	 	Ž
    143	Ź	        	        (not used)	 	
    144		        	        (not used)	 	
    145	‘	        	‘        left single quotation mark	 	‘
    146	’	        	’        right single quotation mark	 	’
    147	“	        	“        left double quotation mark	 	“
    148	”	        	”        right double quotation mark	 	”
    149	•	        	•        bullet	 	•
    150	–	        	–        en dash	 	–
    151	—	        	—        em dash	 	—
    152	˜	        	˜        small tilde	 	˜
    153	™	        	™        trade mark sign	 	™
    154	š	        	š        Latin small letter s with caron	 	š
    155	›	        	›        single right-pointing angle quotation mark	 	›
    156	ś	        	œ        Latin small ligature oe	 	œ
    157	ť	        	        (not used)	 	
    158	ž	        	ž        Latin small letter z with caron	 	ž
    159	ź	        	Ÿ        Latin capital letter Y with diaeresis	 	Ÿ
    160	 	        	        no-break space	 	
    161	ˇ	        	¡        inverted exclamation mark	 	¡
    162	˘	        	¢        cent sign	 	¢
    163	Ł	        	£        pound sign	 	£
    164	¤	        	¤        currency sign	 	¤
    165	Ą	        	¥        yen sign	 	¥
    166	¦	        	¦        broken bar	 	¦
    167	§	        	§        section sign	 	§
    168	¨	        	¨        diaeresis	 	¨
    169	©	        	©        copyright sign	 	©
    170	Ş	        	ª        feminine ordinal indicator	 	ª
    171	«	        	«        left-pointing double angle quotation mark	 	«
    172	¬	        	¬        not sign	 	¬
    173	*	        	*        soft hyphen	 	*
    174	®	        	®        registered sign	 	®
    175	Ż	        	¯        macron	 	¯
    176	°	        	°        degree sign	 	°
    177	±	        	±        plus-minus sign	 	±
    178	˛	        	²        superscript two	 	²
    179	ł	        	³        superscript three	 	³
    180	´	        	´        acute accent	 	´
    181	µ	        	µ        micro sign	 	µ
    182	¶	        	¶        pilcrow sign	 	¶
    183	•	        	•        middle dot	 	•
    184	¸	        	¸        cedilla	 	¸
    185	ą	        	¹        superscript one	 	¹
    186	ş	        	º        masculine ordinal indicator	 	º
    187	»	        	»        right-pointing double angle quotation mark	 	»
    188	Ľ	        	¼        vulgar fraction one quarter	 	¼
    189	˝	        	½        vulgar fraction one half	 	½
    190	ľ	        	¾        vulgar fraction three quarters	 	¾
    191	ż	        	¿        inverted question mark	 	¿
    192	Ŕ	        	À        Latin capital letter A with grave	 	À
    193	Á	        	Á        Latin capital letter A with acute	 	Á
    194	Â	        	Â        Latin capital letter A with circumflex	 	Â
    195	Ă	        	Ã        Latin capital letter A with tilde	 	Ã
    196	Ä	        	Ä        Latin capital letter A with diaeresis	 	Ä
    197	Ĺ	        	Å        Latin capital letter A with ring above	 	Å
    198	Ć	        	Æ        Latin capital letter AE	 	Æ
    199	Ç	        	Ç        Latin capital letter C with cedilla	 	Ç
    200	Č	        	È        Latin capital letter E with grave	 	È
    201	É	        	É        Latin capital letter E with acute	 	É
    202	Ę	        	Ê        Latin capital letter E with circumflex	 	Ê
    203	Ë	        	Ë        Latin capital letter E with diaeresis	 	Ë
    204	Ě	        	Ì        Latin capital letter I with grave	 	Ì
    205	Í	        	Í        Latin capital letter I with acute	 	Í
    206	Î	        	Î        Latin capital letter I with circumflex	 	Î
    207	Ď	        	Ï        Latin capital letter I with diaeresis	 	Ï
    208	Đ	        	Ð        Latin capital letter Eth	 	Ð
    209	Ń	        	Ñ        Latin capital letter N with tilde	 	Ñ
    210	Ň	        	Ò        Latin capital letter O with grave	 	Ò
    211	Ó	        	Ó        Latin capital letter O with acute	 	Ó
    212	Ô	        	Ô        Latin capital letter O with circumflex	 	Ô
    213	Ő	        	Õ        Latin capital letter O with tilde	 	Õ
    214	Ö	        	Ö        Latin capital letter O with diaeresis	 	Ö
    215	×	        	×        multiplication sign	 	×
    216	Ř	        	Ø        Latin capital letter O with stroke	 	Ø
    217	Ů	        	Ù        Latin capital letter U with grave	 	Ù
    218	Ú	        	Ú        Latin capital letter U with acute	 	Ú
    219	Ű	        	Û        Latin capital letter U with circumflex	 	Û
    220	Ü	        	Ü        Latin capital letter U with diaeresis	 	Ü
    221	Ý	        	Ý        Latin capital letter Y with acute	 	Ý
    222	Ţ	        	Þ        Latin capital letter Thorn	 	Þ
    223	ß	        	ß        Latin small letter sharp s	 	ß
    224	ŕ	        	à        Latin small letter a with grave	 	à
    225	á	        	á        Latin small letter a with acute	 	á
    226	â	        	â        Latin small letter a with circumflex	 	â
    227	ă	        	ã        Latin small letter a with tilde	 	ã
    228	ä	        	ä        Latin small letter a with diaeresis	 	ä
    229	ĺ	        	å        Latin small letter a with ring above	 	å
    230	ć	        	æ        Latin small letter ae	 	æ
    231	ç	        	ç        Latin small letter c with cedilla	 	ç
    232	č	        	è        Latin small letter e with grave	 	è
    233	é	        	é        Latin small letter e with acute	 	é
    234	ę	        	ê        Latin small letter e with circumflex	 	ê
    235	ë	        	ë        Latin small letter e with diaeresis	 	ë
    236	ě	        	ì        Latin small letter i with grave	 	ì
    237	í	        	í        Latin small letter i with acute	 	í
    238	î	        	î        Latin small letter i with circumflex	 	î
    239	ď	        	ï        Latin small letter i with diaeresis	 	ï
    240	đ	        	ð        Latin small letter eth	 	ð
    241	ń	        	ñ        Latin small letter n with tilde	 	ñ
    242	ň	        	ò        Latin small letter o with grave	 	ò
    243	ó	        	ó        Latin small letter o with acute	 	ó
    244	ô	        	ô        Latin small letter o with circumflex	 	ô
    245	ő	        	õ        Latin small letter o with tilde	 	õ
    246	ö	        	ö        Latin small letter o with diaeresis	 	ö
    247	÷	        	÷        division sign	 	÷
    248	ř	        	ø        Latin small letter o with stroke	 	ø
    249	ů	        	ù        Latin small letter u with grave	 	ù
    250	ú	        	ú        Latin small letter u with acute	 	ú
    251	ű	        	û        Latin small letter u with circumflex	 	û
    252	ü	        	ü        Latin small letter u with diaeresis	 	ü
    253	ý	        	ý        Latin small letter y with acute	 	ý
    254	ţ	        	þ        Latin small letter thorn	 	þ
    255	˙	        	ÿ        Latin small letter y with diaeresis	 	ÿ
    ' **Note: regarding the two columns with MS "ANSI lists" 0-127 and 128-255 (Columns G and H in the uploaded file)
    ' Microsoft Character list 0-127 - is usually the same as ASCII
    ' Microsoft Character list 128-255 - Windows default. However, values in the ANSI character set above 127 are determined by the code page specific to your operating system. (Windows-1252 or CP-1252 (Windows code page 1252) is a legacy single-byte character encoding that is used by default (as the "ANSI code page") in Microsoft Windows for English and many Western European language settings.)





    Share ‘WunucodeANSI.xlsm’ https://app.box.com/s/20erozqcjs2ljphkiycvbtah08y85fy9
    Last edited by DocAElstein; 12-26-2024 at 06:45 PM.

  5. #15

    Unicode and (Microsoft) ANSI

    Unicode ( and (Microsoft) ANSI )

    "ANSI " as slang for the "first 256 " : Code Page __ AASI
    ANSI as a thing was an attempt to control the different character-to-number lists, in particular the lists under the number 256, rather than being or creating any lists itself, although some initial draft suggestions were given. Microsoft got into this (some people say hijacked it) in the mid 1980s, adopting the provisional ANSI suggestions in their first Windows in 1985.
    The idea at the time centred around giving some identifying name / number to the different lists, the 1 Byte / 8 Bit lists discussed so far. In the early to mid 1980s there was one main draft suggestion for a 0-255 number-to-character list which covered most common European characters. One of the early Microsoft lists, which they called a Windows code page, Windows code page 1252, was originally based on that ANSI draft suggestion. This list is very similar to the one eventually adopted by the International Organization for Standardization (ISO) as Standard 8859-1. (The idea of the code page appeared in Microsoft systems about 1987; the term itself is generally credited to IBM. It became a general term for a table or list of character codes (numbers) and their corresponding glyphs (characters).)
    Because of this history, around the mid 1980s Microsoft talked about ANSI as the new improved thing compared to ASCII. ANSI became a general term for the default code page of a given operating system, such as Windows.
    For our interest as background to Microsoft VB Strings, when we talk about the characters used under 256 we should probably be referring to the Windows code page. But following the convention of the historical reference misnomer perpetration, we would tend to say ANSI.
    Such perpetrations feature a lot in comparisons. Examples:
    _ Generally the first 256 list (0-255) will be referred to as the ANSI characters, or as the 8-bit ANSI code
    _ We might say VB only understands ANSI characters, despite deep in the workings having ……. Unicode xxxxx …….
    _ One way or another, the words code page or ANSI will likely be used as a general term, most likely referring to single Byte, 8 Bit workings in the first 256
    _ A VB string would be described as an ANSI string, under the correct historical reference misnomer perpetration
    I would perhaps suggest a term such as AASI to be equivalent to "ANSI" wherever "ANSI" would be likely to be used under the correct historical reference misnomer perpetration. This new term AASI could be thought of as a generic term for what people are most likely to be referring to: goings-on in a computer either with, or arising from, character processes involved mostly with the first typical 256 (0-255) characters.

    Unicode
    Inevitably something had to be thought up that covered as many things as possible: every character in every language worldwide, and even small pictures, symbols, smileys etc.
    This came about around 1990, under the general word Unicode. It is a general term. The idea was to create rules for assigning numbers to all characters on the planet Earth, where characters would extend to things perhaps better described as the small unit things typically seen in writings wherever on the planet Earth. Maybe like the ANSI idea with an open top end.
    The term Unicode is often used imprecisely to refer to whichever Unicode "encoding"** a particular system uses by default. Unicode encodings include things going by the names of UTF-8, UTF-16, and UTF-32.
    Putting it another way: loosely, Unicode is a text encoding standard that defines characters used in various ordinary, literary, academic, and technical contexts in various languages and assigns them abstracted "code points" (numbers). The "encoding" formats (UTF-8, UTF-16 and UTF-32), on the other hand, define how to translate the standard's abstracted codes for characters into sequences of bytes, and thus how they are actually stored in memory.
    A few technical terms
    _ From ANSI we have the idea 1 Character – 1 number. We make the same thing a bit more advanced sounding by saying 1 Character = 1 Code Point**
    _ **The representation in computer memory of a single character (how we write it down) was discussed / shown by the 7 and 8 Bit binary (1 Byte) diagrams for a single decimal "code point" for ASCII/ANSI. We use the smart sounding word Encoding, usually in Unicode discussions: Encoding = representation in computer memory of a single character / "how we write it down"
    _ In addition to the above we might throw in the word mapping from time to time, as it sounds good. Mostly it is a filler word that probably could be left out completely, as it is usually a general term for any data structure which 'associates' one data value with another.

    2 Byte 16 bit stuff (Unicode Encoding)
    All this ASCII, "Unicode", ANSI stuff on this page 2 is intended as a background to Microsoft VB Strings, which is in turn intended as a background in learning Win32 API in VBA. So it will be biased towards the Microsoft Unicode Encoding. But just briefly, to put it in perspective, here is a short summary of some of the different encodings, to note briefly but not necessarily needed to understand fully yet:
    Code:
    UTF-16/ UTF-8
    UTF-8 is variable, 1 to 4 bytes. This can be efficient for simpler text (and it has had a bit of a resurgence in recent years, due to some simple text stuff associated with smartphone short messages & co.). It may have come about after the next one did not take off so well.
    UTF-16 is variable, 2 or 4 bytes, but mostly 2, and is mostly what Microsoft use. The initial main idea in Unicode encoding was based on using 2 Bytes, fixed. The origin is blurred a bit. Backward compatibility was hampered a bit: UTF-16 is not backwards compatible with ASCII (or any of the ASCII-inclusive 8-bit character encodings), which made this encoding of Unicode impractical in places and led to UTF-8. For the first 128 characters UTF-8 uses 1 Byte each, so UTF-8, on the other hand, is 100% backwards compatible with ASCII.
    
    UTF-32 is fixed 4 bytes.
    Whenever Microsoft say Unicode in Windows, they almost always actually mean UTF-16
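    As a rough cross-check of the byte counts just listed, here is a minimal sketch in VBA. The function names Utf8ByteCount and Utf16ByteCount are made up for this illustration (they are not part of VBA or of any Windows library). They only apply the standard code point ranges of the two encodings to a code point number; they do not do any actual encoding.

    ```vba
    Option Explicit

    ' Hypothetical helper: number of Bytes UTF-8 needs for one code point (0 to &H10FFFF)
    Function Utf8ByteCount(ByVal CodePoint As Long) As Long
        Select Case CodePoint
            Case Is < &H80&:   Utf8ByteCount = 1   ' 0-127, the same single Byte as ASCII
            Case Is < &H800&:  Utf8ByteCount = 2   ' 128-2047
            Case Is < &H10000: Utf8ByteCount = 3   ' 2048-65535
            Case Else:         Utf8ByteCount = 4   ' above 65535
        End Select
    End Function

    ' Hypothetical helper: number of Bytes UTF-16 needs for one code point
    Function Utf16ByteCount(ByVal CodePoint As Long) As Long
        If CodePoint < &H10000 Then
            Utf16ByteCount = 2   ' the "mostly 2" case
        Else
            Utf16ByteCount = 4   ' above 65535 two 2-Byte units are used
        End If
    End Function

    Sub DemoByteCounts()
        ' UTF-32 is not calculated, it is always 4
        Debug.Print "A (65):", Utf8ByteCount(65), Utf16ByteCount(65)
        Debug.Print "… (8230):", Utf8ByteCount(8230), Utf16ByteCount(8230)
    End Sub
    ```

    For the letter A this should give 1 and 2, and for the three dots character 8230 it should give 3 and 2, which is one way to see why UTF-8 is the more compact choice for simple English text and UTF-16 is not.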
    If we wish to concentrate on a working understanding to move forward with VB strings, it may be sufficient to consider that Microsoft’s Unicode encoding is, to a first approximation, like a 2 column base 256 number written the wrong way around (low column first)
    A quick working understanding can be got pictorially by comparing and extending the 8 Bit single Byte mapping diagram, which showed the mapping of a character’s decimal code point to the internal computer binary
    Take as example the decimal number 8230, which in Unicode is the decimal number for a single character looking like 3 small dots close together: …
    The following sketch shows the code point in UTF-16 2-Byte LE encoding (LE: Little Endian = The wrong way around)
    Code:
     '            Low-end Byte                                        High-end Byte
    '  2^7 2^6  2^5  2^4   2^3  2^2  2^1  2^0          2^7 2^6  2^5  2^4   2^3  2^2  2^1  2^0
    '  128  64   32   16    8    4    2    1           128  64   32   16    8    4    2    1
    '   0   0    1    0     0    1    1    0            0   0    1    0     0    0    0    0
    '   0 + 0  + 32 + 0  +  0 +  4 +  2 +  0  =  38     0 + 0  + 32 + 0  +  0 +  0 +  0 +  0  =  32
    '
    '                 256^0                                          256^1
    '                    1                                          256
    '                      38                                      32
    '                       ( 38 x 1 )               +         ( 32 x 256 ) = 8230                                    - calculating the decimal 8230
    ' Using hexadecimal for the two column numbers, we would have 26 20 (Low-end Byte first, as stored). Reading the High-end Byte first gives 20 26, likely seen in literature as  U+2026
     
    Study that sketch and it should all look reasonable. In words: the fundamental unit previously was a Byte, 8 Bits. We could get from 0 to 255 with that. Using 2 Bytes, one possibility could be to add the two numbers, giving 0 to 510, but defining the two numbers as columns in a base system (base 256 in this case) gives us a much wider range, 0 to (255+(256x255))=65535
    (By the way, I never heard of this 2 column base 256 idea before in any explanation. I just noticed that it is that, the UTF-16 2-Byte LE encoding.)
    Typically in an explanation we might see this written as
    38 32 ( or in hexadecimal notation 26 20 )
    More likely, however, an explanation tends not to take any larger code point number as example, considering instead something like the character A, which would look like
    65 00 ( or in hexadecimal notation 41 00 )
    It then can misleadingly talk about the unused 00 separating characters, which only appears so for the lower code points (below 256).
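    The two column base 256 picture above can be checked directly in VBA, because assigning a String to a dynamic Byte array copies the raw 2-Bytes-per-character contents of the string. The following is only a small sketch; the names TwoByteColumns and FromTwoByteColumns are invented for this illustration.

    ```vba
    Option Explicit

    ' Hypothetical helper: decimal values of the Low-end and High-end Bytes of the first character
    Function TwoByteColumns(ByVal Ch As String) As String
        Dim Bytes() As Byte
        Bytes = Left$(Ch, 1)   ' String to Byte array: Low-end Byte first, then High-end Byte
        TwoByteColumns = Bytes(0) & " " & Bytes(1)
    End Function

    ' Hypothetical helper: the sketch calculation, ( Low x 256^0 ) + ( High x 256^1 )
    Function FromTwoByteColumns(ByVal LowByte As Long, ByVal HighByte As Long) As Long
        FromTwoByteColumns = (LowByte * 1) + (HighByte * 256)
    End Function

    Sub DemoTwoByteColumns()
        Debug.Print TwoByteColumns(ChrW(8230))    ' 38 32
        Debug.Print TwoByteColumns("A")           ' 65 0
        Debug.Print FromTwoByteColumns(38, 32)    ' 8230
    End Sub
    ```

    The 0 after the 65 is just the empty High-end column of a small number, not a separator between characters.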

    The story again with a few more technical details
    There seems to be some tradition of adding something before the 4 digit number, such as U+, to give an indication that we are talking Unicode. (Strictly, the U+ number names the code point itself rather than any one encoding, but for these 4 digit numbers it reads the same as the UTF-16 2-Byte value.) In this case we would most likely be in hexadecimal, since 4 hexadecimal digits fit the 2-Byte arrangement nicely: 255 decimal is FF in hexadecimal, the largest 2 digit hexadecimal number, so each Byte is exactly 2 hexadecimal digits. So the U+ means “Unicode” and the numbers are hexadecimal. U+0639 is the Arabic letter Ain. The English letter A would be U+0041. In the meantime we officially have a new name in place of character, a "grapheme", which is defined as the smallest functional unit of a writing system, and which is assigned a magic number by the Unicode consortium. This magic number is called a code point. In fact a more recent development means that a combination of code points can define a final thing: for example, a basic shape has a code point and then one of a few other code points might define its colour. (The concept of a grapheme is abstract and similar to the notion in computing of a character. By comparison, a specific shape that represents any particular grapheme in a given typeface is called a glyph.)
    The largest number in this 4 digit, 2-Byte arrangement we might recognise in literature as looking something like
    U+FFFF
    In the literature, the number we considered, decimal 8230, would likely be given as something like U+2026
    Hello in Unicode, corresponds to these five code points:
    U+0048 U+0065 U+006C U+006C U+006F.
    This is still just a bunch of code points. Numbers, really. We haven’t yet said anything about how to store this in memory.
    That’s where encodings come in.
    The first idea was based on 2 Bytes, but as we have noted other encodings are available.
    The Single Most Important Fact About Encodings: it does not make sense to have a string without knowing what encoding it uses.
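    The Hello code points listed above can be reproduced with a few lines of VBA. This is only a sketch: CodePointsOf is a name invented here, and because it works on the 2-Byte units of a VB string it is only right for characters up to U+FFFF.

    ```vba
    Option Explicit

    ' Hypothetical helper: writes each character of Txt in the U+ hexadecimal notation
    Function CodePointsOf(ByVal Txt As String) As String
        Dim i As Long, Num As Long, Result As String
        For i = 1 To Len(Txt)
            ' AscW returns an Integer, which goes negative above 32767, so mask it into 0-65535
            Num = AscW(Mid$(Txt, i, 1)) And &HFFFF&
            If i > 1 Then Result = Result & " "
            Result = Result & "U+" & Right$("0000" & Hex$(Num), 4)
        Next i
        CodePointsOf = Result
    End Function

    Sub DemoCodePointsOf()
        Debug.Print CodePointsOf("Hello")       ' U+0048 U+0065 U+006C U+006C U+006F
        Debug.Print CodePointsOf(ChrW(8230))    ' U+2026
    End Sub
    ```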
    Generical Term(s), misnomers and the such __ W
    We are predominantly involved with Microsoft stuff, and whether "ANSI" or "UNICODE", the terms are, and likely always will be, used loosely and mostly technically incorrectly. With Unicode the misuse comes less from historical misnomers; more often the word Unicode is misused when referring to the Unicode encoding used. Microsoft are less precise in things Unicode, often having a general term W for wide when distinguishing from "ANSI". This is of course not without its confusion in the broader sense, since UTF-8, which is also Unicode, is only as "wide" as "ANSI" (1 Byte) for the simplest characters. However within Microsoft itself the "W" version usually refers to their UTF-16.
    We will try in the next post to make some definitions to help perpetuate and encourage the awareness of the naming impreciseness

    Ref

    https://www.joelonsoftware.com/2003/...ts-no-excuses/
    https://web.archive.org/web/20230321...p?f=30&t=38460 , https://web.archive.org/web/20230321...38460&start=20
    https://decodeunicode.org/en/u+10000 , https://eileenslounge.com/viewtopic....297518#p297518
    https://web.archive.org/web/20201201...t/tips/varptr/
    Last edited by DocAElstein; Yesterday at 11:03 PM.

  6. #16

    Unicode (WUnicorn) and (Microsoft) "ANSI" (AASI)

    Generical Termsinology relating to computer storage of characters
    Xmas New Year 2024 2025

    Character sets
    The last few posts demonstrate clearly that there is plenty of scope for misuse, poor use, misunderstanding, misnomers, etc., in the use of the terms related to how computers handle characters in memory. It is difficult to move forward in discussions if one tries to be more accurate and precise, since the extra words piss some people off so much that they at best don't want to read further, and at worst want to kill you.
    I would suggest the best compromise would be to have some general terms to help give at least some awareness of the more accurate issues behind, historical and otherwise. These terms can therefore be referenced here for a better understanding and better advancement of mankind.

    ** AASI
    AASI is equivalent to "ANSI" wherever "ANSI" would be likely to be used under the correct historical reference misnomer perpetration. This new term AASI could be thought of as a generic term for what people are most likely to be referring to: goings-on in a computer either with, or arising from, character processes involved mostly with the first typical 256 (0-255) characters. More likely in any conversation, we would be more interested in, or we would be interested in differences in, the second half (128-255).

    ** WUnicorn
    This will be used as a general term for all things "Unicode" or Unicode but centred around, or with more emphasis on, either the "Wide" equivalent of an "ANSI thing" and/or the typical Microsoft UTF-16 (LE) Unicode Encoding


    It would be highly recommended if landing here to briefly read the above posts on this page 2
    https://www.excelfox.com/forum/showt...age2#post17877
    https://www.excelfox.com/forum/showt...age2#post17878
    https://www.excelfox.com/forum/showt...age2#post17879
    https://www.excelfox.com/forum/showt...age2#post17880
    https://www.excelfox.com/forum/showt...age2#post24946





    Generical Termsinology 4 experiment types
    Terminology used in discussing experiments centred around VB Strings , in particular when investigating the string parameters in VB(A) win32 API functions
    Just the basic 4 forms of VB(A) win32 API functions are detailed here. The significance is the main subject of most of the musings around the last dozen or two postings here.

    Straight AASI
    The Declare line is used in the form most often given in VB or VBA literature, whereby:
    _ string parameters are given As String. (Furthermore we note that most often the full parameter would read
    ByVal MyStrvariable As String
    , but not exclusively so: there may occasionally be a ByRef instead.)
    _ Most typically the Win32 API function given will have a trailing A in its name, pseudo like MyWin32APIFunctionA

    "Half way house" AASI (HWH AASI)
    (The terminology arises here from a knowledge of the typical solution that almost always works to get over problems where characters, predominantly those with higher code points ( > 255 ), may somehow give problems. This solution, just very briefly given here, usually involves 2 adjustments to the Straight AASI, the first of which is:
    _ In the Declare line, the typically given As String
    is replaced by As Lo_____, where Lo_____ may be a Long type such as Long or LongPtr
    The second adjustment, using the "W" version of the function, is described under Full WUnicorn.)
    The "Half way house" AASI (HWH AASI) makes only the first adjustment: it replaces the string parameters given As String with As Long or As LongPtr, and keeps the trailing A function

    Full WUnicorn
    The two main characteristics of this solution are
    _ The (Microsoft) "W" version of a Win32 API function, which is most usually available, is used. This usually looks similar to the AASI version, but with a trailing W in place of the trailing A, pseudo like MyWin32APIFunctionW (This is often referred to, under imprecise approximate misnomer convention, as the Unicode version, when distinguishing it from or comparing it with the "A" version, itself spoken of using the full historical misnomer reference "ANSI")
    _ Any string parameters given by As String are replaced with As Long or As LongPtr

    Half way house WUnicorn (HWHWU)
    This is the Full WUnicorn version but with any string parameters left As String
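
    To make the 4 forms concrete, here is a sketch of how the four Declare lines might look for one real pair of functions, lstrlenA and lstrlenW from kernel32, which simply return the length of a string. The Alias names on the left (LenStraightAASI and so on) are invented for this illustration, and the PtrSafe / LongPtr form assumes VBA7 (Office 2010 or later). Only forms 1 and 3 are called here, since those are the two uncontroversial ones; what to pass to forms 2 and 4, and what comes back, is exactly what the experiments are about.

    ```vba
    Option Explicit

    ' 1. Straight AASI: "A" function, string parameter As String
    Private Declare PtrSafe Function LenStraightAASI Lib "kernel32" Alias "lstrlenA" (ByVal lpString As String) As Long
    ' 2. "Half way house" AASI (HWH AASI): "A" function, string parameter As LongPtr
    Private Declare PtrSafe Function LenHWHAASI Lib "kernel32" Alias "lstrlenA" (ByVal lpString As LongPtr) As Long
    ' 3. Full WUnicorn: "W" function, string parameter As LongPtr
    Private Declare PtrSafe Function LenFullWUnicorn Lib "kernel32" Alias "lstrlenW" (ByVal lpString As LongPtr) As Long
    ' 4. Half way house WUnicorn (HWHWU): "W" function, string parameter As String
    Private Declare PtrSafe Function LenHWHWU Lib "kernel32" Alias "lstrlenW" (ByVal lpString As String) As Long

    Sub DemoFourForms()
        Dim Txt As String
        Txt = "Hello"
        Debug.Print LenStraightAASI(Txt)            ' 5 : VBA hands the function a temporary 1-Byte-per-character copy
        Debug.Print LenFullWUnicorn(StrPtr(Txt))    ' 5 : StrPtr hands over the address of the 2-Byte-per-character data itself
    End Sub
    ```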









    ** Termsinology Ratified by order of
    Alan
    Hof
    Xmas / New Year, 2024 2025
    Last edited by DocAElstein; 01-22-2025 at 07:55 PM.

  7. #17

    Unicode (WUnicorn) and (Microsoft) "ANSI" (AASI)

    Chr ( x ) , x = 0 to 255 and the common low end AASI character tables.
    (Window Code pages)
    aka Interactions with AASI and WUnicorns at the bottom bit
    The previous posts have shown and discussed that we have these two things to consider, AASI and WUnicorns.
    Furthermore, the cross over point and associated interactions of the two at the bottom are, as we will go on to see, not so well defined and a bit blurred. Here we will discuss some background issues to help make things clearer later.
    The term "Unicode character" is quite a correct and well defined term, at least to some extent, since Unicode can be used as a term related to a single unique list of characters: a single unique list of characters that is pretty damned massive already and likely to get bigger as long as man exists on this planet. We are mainly concerning ourselves with the 0 to 65535 range
    OK so that is WUnicorn, as I call it. On the other, AASI, end, as we have discussed many times, a term such as "ANSI character" is a slightly more vague historical misnomer thing. It does give some indication of what we are likely to be talking about, and that may in end effect be a character list involving characters mostly "down the bottom" around the 0 to 255 area of the Unicode character list, but it might have the odd few characters from further up the Unicode character list, up to about 400 and occasionally well beyond (the euro sign € is at 8364 and the three dots … at 8230, for example), and similarly may be missing a few under 256 in the Unicode character list.
    Let's go through that in a bit more detail

    Get your bottom end Chr(x) List
    First things first. Make sure you know what characters your current computer has in its AASI character list. It will be similar to ChrW(x) , with x = 0 to 255, but unlikely exactly the same and this might cause awkward issues later if you are unaware of the differences, so it is a good idea to get this list at an early stage. Remember also that it may be slightly different list for different computers.
    The simplest way to get this list, Rem 1 in the coding below, is to simply loop for x = 0 to 255 and paste out Chr(x), as in the next simple coding. While we are at it we will get our Windows code page number, then go on to check our list against any published table for that Windows code page number. Remember you may get slightly different results on your computer
    In Rem 2 I obtain the Windows code page. So far I have seen 1250 and 1252. I researched the internet using those numbers and obtained the appropriate lists, which amongst other things are included in the table examples below
    Code:
    Option Explicit ' https://eileenslounge.com/viewtopic.php?p=324440#p324440
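    #If VBA7 Then ' Office 2010 and later, needed for 64 bit Office
    Private Declare PtrSafe Function GetACP Lib "kernel32" () As Long
    #Else
    Private Declare Function GetACP Lib "kernel32" () As Long
    #End If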
    Sub GetMyBottomEndAASIs() ' https://www.excelfox.com/forum/showthread.php/2824-Tests-Copying-Pasting-API-Cliipboard-issues-and-Rough-notes-on-Advanced-API-stuff?p=24947&viewfull=1#post24947
    Rem 1 make my own list
    Dim Ex As Long
        For Ex = 0 To 255
         Let Range("H" & Ex + 4 & "") = Ex
         Let Range("I" & Ex + 4 & "") = Chr(Ex)
        Next Ex
    ' Small bodge: remove any automatic wrap text feature that may adjust the cell height when the cell receives the line feed character ( Chr(10) vbLf ). The Debug.Print is a quick check to make sure we still have the character there
     Let Rows("14:14").WrapText = False: Debug.Print Asc(Range("I" & 14 & "").Value)
    Rem 2 get the windows code page number     ( ' https://eileenslounge.com/viewtopic.php?p=324443#p324443 )
    Dim WindowsCodePageNumber As Long
     Let WindowsCodePageNumber = GetACP()
    Debug.Print WindowsCodePageNumber ' so far I have seen   1250    and the most common one which is   1252
    End Sub
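
    A quick way to see where your own Chr(x) list departs from ChrW(x) is to ask for the Unicode number of each Chr(x) and print only the ones that are not simply x. This is a sketch only; AASIToWUnicorn and ListChrVersusChrW are names invented here, and the output depends on the Windows code page of the computer it is run on.

    ```vba
    Option Explicit

    ' Hypothetical helper: the Unicode (WUnicorn) number of the character that Chr(x) gives on this computer
    Function AASIToWUnicorn(ByVal x As Long) As Long
        AASIToWUnicorn = AscW(Chr(x)) And &HFFFF&   ' mask because AscW can return a negative Integer
    End Function

    Sub ListChrVersusChrW()
        Dim x As Long
        For x = 128 To 255
            ' Only the rows where the AASI list and the Unicode list disagree are printed
            If AASIToWUnicorn(x) <> x Then
                Debug.Print x, "U+" & Right$("0000" & Hex$(AASIToWUnicorn(x)), 4), Chr(x)
            End If
        Next x
    End Sub
    ```

    On a computer with Windows code page 1252 the first line printed would be expected to be for 128, with U+20AC, the euro sign. On a 1250 computer there should be rather more lines, as the table below suggests.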
    


    Here I am just collecting a few lists of Chr(x) , with x from 0 to 255 , and typically alongside I will have a code page list. The significance has been touched on above, and will be discussed further later. Any reference names are for me to tie up list to some of my computers

    ' https://i.postimg.cc/Kj0YvV3Z/Windows-CPage-KB1252.jpg
    ' https://i.postimg.cc/BZVgVzkc/Window...ungary1250.jpg


    Code:
     
    Central Europe    (my Hungary  SSD2)	 		More typical (Western Europe)
    x	Chr(x)	Win1250	MS-DOS852	MS-DOS852		x	Chr(x)	Win1252	MS-DOS850
    0		␀		 		0		NUL	 
    1	 	␁		 		1	 	SOH	 
    2	 	␂		 		2	 	STX	 
    3	 	␃		 		3	 	ETX	 
    4	 	␄		 		4	 	EOT	 
    5	 	␅		 		5	 	ENQ	 
    6	 	␆		 		6	 	ACK	 
    7	 	␇		 		7	 	BEL	 
    8	 	␈		 		8	 	BS	 
    9		␉		 		9		HT	 
    10		␊		 		10		LF	 
    11		␋		 		11		VT	 
    12	 	␌		 		12	 	FF	 
    13		␍		 		13		CR	 
    14		␎		 		14		SO	 
    15	 	␏		 		15	 	SI	 
    16	 	␐		 		16	 	DLE	 
    17	 	␑		 		17	 	DC1	 
    18	 	␒		 		18	 	DC2	 
    19	 	␓		 		19	 	DC3	 
    20	 	␔		 		20	 	DC4	 
    21	 	␕		 		21	 	NAK	 
    22	 	␖		 		22	 	SYN	 
    23	 	␗		 		23	 	ETB	 
    24	 	␘		 		24	 	CAN	 
    25	 	␙		 		25	 	EM	 
    26	 	␚		 		26	 	SUB	 
    27	 	␛		 		27	 	ESC	 
    28	 	␜		 		28	 	FS	 
    29	 	␝		 		29	 	GS	 
    30	 	␞		 		30	 	RS	 
    31	¬	␟		 		31	¬	US	 
    32	 	␠		 		32	 	SP	 
    33	!	!	!	 		33	!	!	!
    34	"	"	"	 		34	"	"	"
    35	#	#	#	 		35	#	#	#
    36	$	$	$	 		36	$	$	$
    37	%	%	%	 		37	%	%	%
    38	&	&	&	 		38	&	&	&
    39		'	'	 		39		'	'
    40	(	(	(	 		40	(	(	(
    41	)	)	)	 		41	)	)	)
    42	*	*	*	 		42	*	*	*
    43	+	+	+	 		43	+	+	+
    44	,	,	,	 		44	,	,	,
    45	-	-	-	 		45	-	-	-
    46	.	.	.	 		46	.	.	.
    47	/	/	/	 		47	/	/	/
    48	0	0	0	 		48	0	0	0
    49	1	1	1	 		49	1	1	1
    50	2	2	2	 		50	2	2	2
    51	3	3	3	 		51	3	3	3
    52	4	4	4	 		52	4	4	4
    53	5	5	5	 		53	5	5	5
    54	6	6	6	 		54	6	6	6
    55	7	7	7	 		55	7	7	7
    56	8	8	8	 		56	8	8	8
    57	9	9	9	 		57	9	9	9
    58	:	:	:	 		58	:	:	:
    59	;	;	;	 		59	;	;	;
    60	<	<	<	 		60	<	<	<
    61	=	=	=	 		61	=	=	=
    62	>	>	>	 		62	>	>	>
    63	?	?	?	 		63	?	?	?
    64	@	@	@	 		64	@	@	@
    65	A	A	A	 		65	A	A	A
    66	B	B	B	 		66	B	B	B
    67	C	C	C	 		67	C	C	C
    68	D	D	D	 		68	D	D	D
    69	E	E	E	 		69	E	E	E
    70	F	F	F	 		70	F	F	F
    71	G	G	G	 		71	G	G	G
    72	H	H	H	 		72	H	H	H
    73	I	I	I	 		73	I	I	I
    74	J	J	J	 		74	J	J	J
    75	K	K	K	 		75	K	K	K
    76	L	L	L	 		76	L	L	L
    77	M	M	M	 		77	M	M	M
    78	N	N	N	 		78	N	N	N
    79	O	O	O	 		79	O	O	O
    80	P	P	P	 		80	P	P	P
    81	Q	Q	Q	 		81	Q	Q	Q
    82	R	R	R	 		82	R	R	R
    83	S	S	S	 		83	S	S	S
    84	T	T	T	 		84	T	T	T
    85	U	U	U	 		85	U	U	U
    86	V	V	V	 		86	V	V	V
    87	W	W	W	 		87	W	W	W
    88	X	X	X	 		88	X	X	X
    89	Y	Y	Y	 		89	Y	Y	Y
    90	Z	Z	Z	 		90	Z	Z	Z
    91	[	[	[	 		91	[	[	[
    92	\	\	\	 		92	\	\	\
    93	]	]	]	 		93	]	]	]
    94	^	^	^	 		94	^	^	^
    95	_	_	_	 		95	_	_	_
    96	`	`	`	 		96	`	`	`
    97	a	a	a	 		97	a	a	a
    98	b	b	b	 		98	b	b	b
    99	c	c	c	 		99	c	c	c
    100	d	d	d	 		100	d	d	d
    101	e	e	e	 		101	e	e	e
    102	f	f	f	 		102	f	f	f
    103	g	g	g	 		103	g	g	g
    104	h	h	h	 		104	h	h	h
    105	i	i	i	 		105	i	i	i
    106	j	j	j	 		106	j	j	j
    107	k	k	k	 		107	k	k	k
    108	l	l	l	 		108	l	l	l
    109	m	m	m	 		109	m	m	m
    110	n	n	n	 		110	n	n	n
    111	o	o	o	 		111	o	o	o
    112	p	p	p	 		112	p	p	p
    113	q	q	q	 		113	q	q	q
    114	r	r	r	 		114	r	r	r
    115	s	s	s	 		115	s	s	s
    116	t	t	t	 		116	t	t	t
    117	u	u	u	 		117	u	u	u
    118	v	v	v	 		118	v	v	v
    119	w	w	w	 		119	w	w	w
    120	x	x	x	 		120	x	x	x
    121	y	y	y	 		121	y	y	y
    122	z	z	z	 		122	z	z	z
    123	{	{	{	 		123	{	{	{
    124	|	|	|	 		124	|	|	|
    125	}	}	}	 		125	}	}	}
    126	~	~	~	 		126	~	~	~
    127		␡		 		127		DEL	 
    128	€	€	Ç	€		128	€	€	Ç
    129			ü			129			ü
    130	‚	‚	é	‚		130	‚	‚	é
    131	ƒ		â	ƒ		131	ƒ	ƒ	â
    132	„	„	ä	„		132	„	„	ä
    133	…	…	ů	…		133	…	…	à
    134	†	†	ć	†		134	†	†	å
    135	‡	‡	ç	‡		135	‡	‡	ç
    136	ˆ		ł	ˆ		136	ˆ	ˆ	ê
    137	‰	‰	ë	‰		137	‰	‰	ë
    138	Š	Š	Ő	Š		138	Š	Š	è
    139	‹	‹	ő	‹		139	‹	‹	ï
    140	Ś	Ś	î	Œ		140	Œ	Œ	î
    141	Ť	Ť	Ź			141			ì
    142	Ž	Ž	Ä	Ž		142	Ž	Ž	Ä
    143	Ź	Ź	Ć			143			Å
    144			É			144			É
    145	‘	‘	Ĺ	‘		145	‘	‘	æ
    146	’	’	ĺ	’		146	’	’	Æ
    147	“	“	ô	“		147	“	“	ô
    148	”	”	ö	”		148	”	”	ö
    149	•	•	Ľ	•		149	•	•	ò
    150	–	–	ľ	–		150	–	–	û
    151	—	—	Ś	—		151	—	—	ù
    152	˜		ś	˜		152	˜	˜	ÿ
    153	™	™	Ö	™		153	™	™	Ö
    154	š	š	Ü	š		154	š	š	Ü
    155	›	›	Ť	›		155	›	›	ø
    156	ś	ś	ť	œ		156	œ	œ	£
    157	ť	ť	Ł			157			Ø
    158	ž	ž	×	ž		158	ž	ž	×
    159	ź	ź	č	Ÿ		159	Ÿ	Ÿ	ƒ
    160	 		á	 		160	 	NBSP	á
    161	ˇ	ˇ	í	¡		161	¡	¡	í
    162	˘	˘	ó	¢		162	¢	¢	ó
    163	Ł	Ł	ú	£		163	£	£	ú
    164	¤	¤	Ą	¤		164	¤	¤	ñ
    165	Ą	Ą	ą	¥		165	¥	¥	Ñ
    166	¦	¦	Ž	¦		166	¦	¦	ª
    167	§	§	ž	§		167	§	§	º
    168	¨	¨	Ę	¨		168	¨	¨	¿
    169	©	©	ę	©		169	©	©	®
    170	Ş	Ş	¬	ª		170	ª	ª	¬
    171	«	«	ź	«		171	«	«	½
    172	¬	¬	Č	¬		172	¬	¬	¼
    173	¬	¬	ş	¬		173	¬	¬SHY	¡
    174	®	®	«	®		174	®	®	«
    175	Ż	Ż	»	¯		175	¯	¯	»
    176	°	°	░	°		176	°	°	░
    177	±	±	▒	±		177	±	±	▒
    178	˛	˛	▓	²		178	²	²	▓
    179	ł	ł	│	³		179	³	³	│
    180	´	´	┤	´		180	´	´	┤
    181	µ	µ	Á	µ		181	µ	µ	Á
    182	¶	¶	Â	¶		182	¶	¶	Â
    183	•	•	Ě	•		183	•	•	À
    184	¸	¸	Ş	¸		184	¸	¸	©
    185	ą	ą	╣	¹		185	¹	¹	╣
    186	ş	ş	║	º		186	º	º	║
    187	»	»	╗	»		187	»	»	╗
    188	Ľ	Ľ	╝	¼		188	¼	¼	╝
    189	˝	˝	Ż	½		189	½	½	¢
    190	ľ	ľ	ż	¾		190	¾	¾	¥
    191	ż	ż	┐	¿		191	¿	¿	┐
    192	Ŕ	Ŕ	└	À		192	À	À	└
    193	Á	Á	┴	Á		193	Á	Á	┴
    194	Â	Â	┬	Â		194	Â	Â	┬
    195	Ă	Ă	├	Ã		195	Ã	Ã	├
    196	Ä	Ä	─	Ä		196	Ä	Ä	─
    197	Ĺ	Ĺ	┼	Å		197	Å	Å	┼
    198	Ć	Ć	Ă	Æ		198	Æ	Æ	ã
    199	Ç	Ç	ă	Ç		199	Ç	Ç	Ã
    200	Č	Č	╚	È		200	È	È	╚
    201	É	É	╔	É		201	É	É	╔
    202	Ę	Ę	╩	Ê		202	Ê	Ê	╩
    203	Ë	Ë	╦	Ë		203	Ë	Ë	╦
    204	Ě	Ě	╠	Ì		204	Ì	Ì	╠
    205	Í	Í	═	Í		205	Í	Í	═
    206	Î	Î	╬	Î		206	Î	Î	╬
    207	Ď	Ď	¤	Ï		207	Ï	Ï	¤
    208	Đ	Đ	đ	Ð		208	Ð	Ð	ð
    209	Ń	Ń	Đ	Ñ		209	Ñ	Ñ	Ð
    210	Ň	Ň	Ď	Ò		210	Ò	Ò	Ê
    211	Ó	Ó	Ë	Ó		211	Ó	Ó	Ë
    212	Ô	Ô	ď	Ô		212	Ô	Ô	È
    213	Ő	Ő	Ň	Õ		213	Õ	Õ	ı
    214	Ö	Ö	Í	Ö		214	Ö	Ö	Í
    215	×	×	Î	×		215	×	×	Î
    216	Ř	Ř	ě	Ø		216	Ø	Ø	Ï
    217	Ů	Ů	┘	Ù		217	Ù	Ù	┘
    218	Ú	Ú	┌	Ú		218	Ú	Ú	┌
    219	Ű	Ű	█	Û		219	Û	Û	█
    220	Ü	Ü	▄	Ü		220	Ü	Ü	▄
    221	Ý	Ý	Ţ	Ý		221	Ý	Ý	¦
    222	Ţ	Ţ	Ů	Þ		222	Þ	Þ	Ì
    223	ß	ß	▀	ß		223	ß	ß	▀
    224	ŕ	ŕ	Ó	à		224	à	à	Ó
    225	á	á	ß	á		225	á	á	ß
    226	â	â	Ô	â		226	â	â	Ô
    227	ă	ă	Ń	ã		227	ã	ã	Ò
    228	ä	ä	ń	ä		228	ä	ä	õ
    229	ĺ	ĺ	ň	å		229	å	å	Õ
    230	ć	ć	Š	æ		230	æ	æ	µ
    231	ç	ç	š	ç		231	ç	ç	þ
    232	č	č	Ŕ	è		232	è	è	Þ
    233	é	é	Ú	é		233	é	é	Ú
    234	ę	ę	ŕ	ê		234	ê	ê	Û
    235	ë	ë	Ű	ë		235	ë	ë	Ù
    236	ě	ě	ý	ì		236	ì	ì	ý
    237	í	í	Ý	í		237	í	í	Ý
    238	î	î	ţ	î		238	î	î	¯
    239	ď	ď	´	ï		239	ï	ï	´
    240	đ	đ	¬	ð		240	ð	ð	¬
    241	ń	ń	˝	ñ		241	ñ	ñ	±
    242	ň	ň	˛	ò		242	ò	ò	‗
    243	ó	ó	ˇ	ó		243	ó	ó	¾
    244	ô	ô	˘	ô		244	ô	ô	¶
    245	ő	ő	§	õ		245	õ	õ	§
    246	ö	ö	÷	ö		246	ö	ö	÷
    247	÷	÷	¸	÷		247	÷	÷	¸
    248	ř	ř	°	ø		248	ø	ø	°
    249	ů	ů	¨	ù		249	ù	ù	¨
    250	ú	ú	˙	ú		250	ú	ú	•
    251	ű	ű	ű	û		251	û	û	¹
    252	ü	ü	Ř	ü		252	ü	ü	³
    253	ý	ý	ř	ý		253	ý	ý	²
    254	ţ	ţ	■	þ		254	þ	þ	■
    255	˙	˙	 	ÿ		255	ÿ	ÿ	 
    
     












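    ISO-8859-1: the main difference from Windows-1252 seems to be that 128 to 159 hold no printable characters in ISO-8859-1 (they are reserved for control codes), whereas Windows-1252 uses most of them for characters such as € and …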
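    A quick way to see that difference from VBA, as a rough sketch only. It assumes a Windows machine whose ANSI code page is 1252 (Chr goes through the system ANSI code page, so other locales will print other numbers). The Sub name is made up for this illustration. ChrW takes the number directly as a Unicode code point, and the first 256 Unicode code points follow ISO-8859-1, so ChrW(133) lands on a control code rather than the ellipsis.

```vba
Sub CompareCodePage1252WithLatin1()   ' hypothetical name, for illustration only
    ' Chr uses the system ANSI code page (assumed here to be Windows-1252)
    Debug.Print AscW(Chr(133))    ' 8230 on a Windows-1252 system: the ellipsis, U+2026
    ' ChrW takes the number as a Unicode code point; 0 to 255 match ISO-8859-1
    Debug.Print AscW(ChrW(133))   ' 133: a control code, not a printable character
End Sub
```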
    Last edited by DocAElstein; 02-04-2025 at 01:06 AM.

  8. #18
    VBA and VB variable addresses and pointers.

    This does not follow the classical listing and brief description of variable types. (That can, for example, be found here https://www.excelfox.com/forum/showt...ll=1#post23882 )
    We are more concerned here with how the variable is held in memory, and the concepts centred around the pointer idea
    Pointer/ Variable object constants.
    Pointer

    Let us first give the very simplest description of a computer pointer (hoping that no professional computer expert reads it and then wants to kill me for my naivety): a pointer can be thought of as a variable (which itself is a chunk of computer memory held somewhere, somehow) that contains not the final value, but some information allowing you to get at the actual value.
    This pointer idea seems a strange concept initially, but it grows on you, and after a bit of thinking it makes some sense:
    The simple idea of a variable holding a value, or even directly giving you the address / location of it, is a rather haphazard, not particularly ordered way of doing things if you think about it. The simplified idea of the variable naming the memory place where a value is kept is fine for explaining how you store some things in a shoe box, but it is unlikely to be an efficient way to organise something that is not a human brain but a mesh of simple 0s and 1s.

    Variable object constants
    The ordered / efficient way involves processes requiring a very deep, low level computer knowledge. We might attribute some things to a variable, so we are in the realms of vague Object Orientated Programming property concepts, and so, maybe not surprisingly, we end up calling a variable an object. What the object is often gets blurred: we might call the pointer the object, whereas the object might instead be regarded as being of the pointer class. Here is a good one I just thought of: a variable might be considered to actually be an object of a pointer class, or the word pointer might be considered the name of the object, that being the object we perhaps do not name or refer to so well by use of the word variable. The word variable is more like a convenient term of reference to relate something to a value being held, or a value to be held. But this misleads a bit: the over simplified idea of the variable naming the memory place of a value is fine for explaining how you store some things in a shoe box, but it is totally inappropriate to the more sane and efficiently organised ideas starting at the COFF symbol table and extending into the pointer(s).
    These object ideas come about because we end up attributing things to them, so then they have properties.

    I will use the examples of a Long and a String
    Starting at the bottom with a simple variable, Long , as it then leads on nicely to the slightly more complex String

    ( A Tool, VarPtr
    For the time being we will just accept that this is a function ( ** which coincidentally has no documentation ) that gives us the pointer value (the address it is intended to "point" to), or, when used with ByVal, the number value actually held at that address. ( ** My best guess is that it is some API function that is made available to us in VB(A) without needing a Declare statement
    ) )

    Long
    Skematic VBA Long.jpg
    Code:
    Sub  VBALongTypeMemoryStuff()
    1 Dim Lng As Long
    2 Debug.Print VarPtr(Lng)        '  2355700    I don't know where this number is held in memory. I don't care
    3 Debug.Print VarPtr(ByVal Lng)  '  0
    4 Debug.Print Lng                '  0
     
     Let Lng = 2
    
    5 Debug.Print VarPtr(Lng)          '  2355700
    6 Debug.Print VarPtr(ByVal Lng)    '  2
    
    8 Debug.Print Lng                  '  2
    End Sub
    (** Pointer values are not unique, neither across different computers nor in the same computer and software at different times. The method used to allocate them may occasionally, by coincidence, give the same number for a short time period, for re-runs of the same simple coding on the same computer. )
    Starting at the very bottom, and from after line 1 of the coding, a number, usually referred to as a pointer, will be stored somehow, somewhere in the computer. The actual location of this requires an in-depth low level knowledge which is perhaps really too complicated, and of much too little importance, for us to consider. Suffice to say it goes by the name of the Common Object File Format ( COFF ) symbol table, and could perhaps, by a layman, be regarded as some sort of stack or shelf arrangement populated by some complex rules allowing software to access it as necessary. Our interest starts at what is "in" this, and perhaps some knowledge of its size / construction would not go amiss.
    It is a number which is made with 4 bytes (32 bits) in 32 bit Office, as used here (a pointer is 8 bytes in 64 bit Office). It is often called a Pointer. The actual number refers to the first memory address (the first memory address at the left hand side) of 4 Bytes (32 bits) set aside in memory to hold the final value which I later assign to the variable (with Let Lng = 2 ). In other words, the act of doing Dim Lng As Long reserves for me 4 sequential memory addresses / byte locations, in a sequential row as it were. The memory location of the first one, the first byte, at the left as it were, is what the second code line, Debug.Print VarPtr(Lng) , gets for me ( I got the value shown in the ' comment, 2355700** ).
    The third line, Debug.Print VarPtr(ByVal Lng) , is perhaps giving me the same as the 4th line, which is the value of the variable. A Long type does have a value even before I assign one: it has the value 0. If I assigned it 0 with Let Lng = 0 (before I do Let Lng = 2 ), nothing would change anywhere. What I would possibly have done is replaced the value of 0 with 0.
    In line 5, after Let Lng = 2 , the memory location, 2355700, has not changed, as it never does for the VBA Long type, even if I assign a much different number. It doesn’t need to change, because 32 bits are enough to hold the binary value of any number in the defined number range for a VBA Long type.
    Line 6 and line 8: My results suggest that Debug.Print VarPtr(ByVal Lng) and Debug.Print Lng are giving me the same thing, the value seen at the address 2355700**.
    ( **The address I got was 2355700 , you will get a similar but different number. )
    In this situation what is actually the "Pointer" is a slightly vague concept: you would use it to refer to one or more of those things, or all of them, depending on the context in which you use it.
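    One way to check the idea that the value really does sit at the address given by VarPtr is to copy the 4 bytes found there into a second Long. This is only a sketch, not part of the original coding: it assumes Windows, and it uses the kernel32 function RtlMoveMemory under the alias CopyMemory that VB(A) coders commonly give it. The Sub name and the variable LngCopy are made up for the illustration. The Declare lines must go at the top of a normal code module.

```vba
Option Explicit

' Windows only. RtlMoveMemory copies raw bytes from one address to another.
' "CopyMemory" is just the alias commonly given to it in VB(A) code.
#If VBA7 Then
    Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
        (ByRef Destination As Any, ByVal Source As LongPtr, ByVal Length As LongPtr)
#Else
    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
        (ByRef Destination As Any, ByVal Source As Long, ByVal Length As Long)
#End If

Sub ReadValueAtVarPtrAddress()   ' hypothetical name, for illustration only
    Dim Lng As Long, LngCopy As Long
    Lng = 2
    ' Copy the 4 bytes starting at the address VarPtr(Lng) into a second Long
    CopyMemory LngCopy, VarPtr(Lng), 4
    Debug.Print VarPtr(Lng)   ' some address, different on each machine / run
    Debug.Print LngCopy       ' 2, the value found at that address
End Sub
```

    If LngCopy comes back as 2, that supports the picture above: the number from VarPtr(Lng) is where the 4 bytes of the Long are kept.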




    String experiments in the next post.
    Last edited by DocAElstein; 01-25-2025 at 03:47 PM.

  9. #19
    String experiments note:
    Our main interest in this Thread is the behaviour of the string, but the experiments with the Long above were expected to fit in nicely before considering the string, as they did and do. An unexpected bonus was also noticed. To experience this bonus, it would likely be necessary to run the coding in the last post again, immediately before the coding in this post.

    String
    Skematic VBA String.jpg
    String memory handling generally in computing is a bit more dynamic / complex than for something like a simple number, since even with modern computers we don’t want to go around reserving a gigabyte or two of memory every time we declare a string variable.
    Here is a coding similar to the last
    Code:
    Sub ConfusedWithVBAandVBStringPointerStuff()
    1 Dim Str As String
    2 Debug.Print VarPtr(Str)        '  2355700   I don't know where this number is held in memory. I don't care. 
    3 Debug.Print VarPtr(ByVal Str)              ' 0
    4 Debug.Print StrPtr(ByVal Str); StrPtr(Str) '  0     0
    
     Let Str = "ABCD"
    5 Debug.Print VarPtr(Str)                                      '  2355700 - makes sense - no reason for this address to change. It is the first Byte of the 4 bytes that holds the VB pointer/ address, whatever value that is
    6 Debug.Print VarPtr(ByVal Str)               '  4028444
    7 Debug.Print StrPtr(ByVal  Str); StrPtr(Str)  '  4028444  4028444
    
    8 Debug.Print Str
    End Sub
    One immediate, unexpected, interesting thing is that the second code line, Debug.Print VarPtr(Str) , actually returned me the very same number as the second code line in the coding from the last post, Debug.Print VarPtr(Lng) !! I do realise that will not always be the case, but as I ran the coding shortly after running the previous coding on the same computer, there is a chance it will be the same, as it was, … because ….. how about this: What Dim Str As String in the coding above is doing is very similar to what the Dim Lng As Long did / does in the coding from the last post: it sets aside once again 4 Bytes (32 bits) for me. But the difference now is that it is not reserving me a place to store any final number or character values, rather it is reserving me a place … for …. a …. 32 bit Pointer, specifically a pointer to a VB String.
    So I could say, what I have is a VBA Pointer to a VB Pointer, or perhaps a VBA Pointer to what likely will be a VB Pointer, depending on how you feel the word pointer should be used.
    At this stage, after Dim Str As String, and before Let Str = "ABCD", we are at a situation very similar to the second ( and final ) level in the last diagram ( the main difference is that we have a value of 0 now instead of the value 2 ). In the previous diagram, the assignment Let Lng = 2 did not change much other than replacing a value of zero with a value of 2. In the present case of a string, the Let Str = "ABCD" also does a simple change of value ( from 0 to 4028444 ), but that is not all: we are not at the final level in this case, and the number finally returned from VarPtr(ByVal Str) after we assign the value is not the final string value, but rather the address of the final string, 4028444. (But note also that this address is not the start of the memory I finally use for my string, but rather 4 Bytes along, where the actual Unicode UTF-16 LE encoding Bytes start. )
    At the point before Let Str = "ABCD" I have no memory at all set aside for any final string value. In crude terms, the final top section from the diagram below does not exist ( instead I have this situation https://i.postimg.cc/rwGWm2F8/vb-Null-String.jpg ), and the value at memory location 2355700 closely resembles the situation at code line 4 of the previous coding, having a value of 0. (At this point I could regard this situation as a "vbNullString situation" ## , but that is just something I came up with to help me remember what vbNullString is about. )
    After Let Str = "ABCD", I have the situation depicted in the diagram below.
    As already noted, the final value returned from VarPtr(ByVal Str) after we assign the value is not the final string value, but rather the memory address location of the final string.
    (## Note in passing that, after Let Str = "ABCD" , the later use of Let Str = vbNullString allows us to return to the situation at code line 4. )
    As for code line 8, I would assume that VB(A) knows how to navigate along the 2 pointers to get the final value, for example by going further until it no longer sees a pointer, or by some similar innards coding.
    The sketch below illustrates the situation after Let Str = "ABCD" , (which remains the situation until the coding ends, since the following lines change nothing: they just get us some information about the situation.)



    StrPtr v VarPtr
    StrPtr is as mysterious and undocumented as VarPtr. It may have come in at the point when the changes involving Unicode came about. Some literature suggests it somehow makes sure we go to the memory area in which the actual Unicode string is stored. Maybe that suggests it is safer to use than VarPtr(ByVal Str) .
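    To look at what sits around the address given by StrPtr, here is a rough sketch, again not part of the original coding. It assumes Windows and the usual CopyMemory alias for the kernel32 function RtlMoveMemory. The Sub name and variable names are made up for the illustration. It reads the 4 byte length field just before the address, then the character bytes and the 2 byte terminating null that follow.

```vba
Option Explicit

' Windows only. "CopyMemory" is the alias commonly given to RtlMoveMemory.
#If VBA7 Then
    Private Declare PtrSafe Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
        (ByRef Destination As Any, ByVal Source As LongPtr, ByVal Length As LongPtr)
#Else
    Private Declare Sub CopyMemory Lib "kernel32" Alias "RtlMoveMemory" _
        (ByRef Destination As Any, ByVal Source As Long, ByVal Length As Long)
#End If

Sub PeekAtStringMemory()   ' hypothetical name, for illustration only
    Dim strTxt As String, ByteLen As Long, Byts() As Byte, i As Long
    strTxt = "ABCD"
    ' The 4 bytes just before the address given by StrPtr hold the length in bytes
    CopyMemory ByteLen, StrPtr(strTxt) - 4, 4
    Debug.Print ByteLen              ' 8, that is 4 characters at 2 bytes each
    ' Read the character bytes plus the 2 byte terminating null
    ReDim Byts(0 To ByteLen + 1)
    CopyMemory Byts(0), StrPtr(strTxt), ByteLen + 2
    For i = 0 To UBound(Byts)
        Debug.Print Byts(i);         ' the values are 65 0 66 0 67 0 68 0 0 0
    Next i
    Debug.Print
End Sub
```

    The pairs 65 0 , 66 0 and so on are the UTF-16 LE bytes of A, B, C, D, and the final 0 0 is the terminating null character.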

    vbNullString v ""
    I am including this bit of extra info here as it is perhaps easy to understand after the previous stuff, and also it can help consolidate the previous knowledge
    The following coding is only slightly different from the previous. The first change is to replace Let Str = "ABCD" with a zero length string assignment, Let strOb = "" (the variable is now named strOb ).
    Code:
    Sub VBAandVBStringPointerStuffandvbNullString() '  https://www.excelfox.com/forum/showthread.php/2824-Tests-Copying-Pasting-API-Cliipboard-issues-and-Rough-notes-on-Advanced-API-stuff?p=24948&viewfull=1#post24948
    1 Dim strOb As String
    2 Debug.Print VarPtr(strOb)        '  2355700   I don't know where this number is held in memory. I don't care
    3 Debug.Print VarPtr(ByVal strOb)              ' 0
    4 Debug.Print StrPtr(ByVal strOb); StrPtr(strOb) ' 0   0
    
     Let strOb = ""
    5 Debug.Print VarPtr(strOb)        '  2355700 - makes sense - no reason for this address to change. It is the first Byte of the 4 bytes that holds the VB pointer/ address, whatever value that is
    6 Debug.Print VarPtr(ByVal strOb)               '  4028444
    7 Debug.Print StrPtr(ByVal strOb); StrPtr(strOb)  '  4028444  4028444
    
    8 Debug.Print Len(strOb)           '   0
     
    9 Let strOb = vbNullString
    10 Debug.Print VarPtr(strOb)        '  2355700   I don't know where this number is held in memory. I don't care
    11 Debug.Print VarPtr(ByVal strOb)              ' 0
    12 Debug.Print StrPtr(ByVal strOb); StrPtr(strOb) ' 0   0
    
    End Sub
    After this Let strOb = "" we have a situation very similar to the situation after Let Str = "ABCD" in the previous coding. The only difference is that the top chunk of memory has changed to this:

    https://i.postimg.cc/rm68X18P/Zero-length-string.jpg

    Effectively the final place we "point" to is the representation of the terminating null character (vbNullChar = Chr(0)) which indicates the end of any string. So we end where we start, and have a length of zero.

    After code line 9, the use of vbNullString has brought us back to the situation when we have only done Dim strOb As String so far, and not gone as far as to do anything like Let strOb = "" or Let strOb = "ABCD", which pictorially can be represented by
    https://i.postimg.cc/rwGWm2F8/vb-Null-String.jpg


    It should be noted that vbNullString was introduced for API things around the VB4 - VB5 time, and there is no way in VB(A) to tell the difference between vbNullString and "" other than in the pointer ways, as done in the coding above.
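    That pointer test can be wrapped up in a small helper. This is just a sketch: the function name IsNullString and the test Sub are made up for the illustration, and it relies only on StrPtr returning 0 when no string memory has been allocated.

```vba
Function IsNullString(ByRef strIn As String) As Boolean   ' hypothetical helper
    ' StrPtr is 0 only when no string memory has been allocated at all
    IsNullString = (StrPtr(strIn) = 0)
End Function

Sub TestIsNullString()
    Dim strOb As String
    Debug.Print IsNullString(strOb)   ' True   (never assigned)
    strOb = ""
    Debug.Print IsNullString(strOb)   ' False  (zero length string, memory allocated)
    strOb = vbNullString
    Debug.Print IsNullString(strOb)   ' True   (back to the null pointer)
    ' Ordinary VBA comparisons cannot tell the two apart
    Debug.Print (strOb = ""), Len(strOb)   ' True   0
End Sub
```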


    Ref: VarPtr , StrPtr stuff https://classicvb.net/tips/varptr/ , https://www.vba-tutorial.de/referenz/zeiger.htm
    https://www.aivosto.com/articles/stringopt2.html
    vbNullString https://eileenslounge.com/viewtopic....323978#p323978
    Last edited by DocAElstein; 01-27-2025 at 10:38 PM.

  10. #20
    Some further string Type Terminology often used in API related things
    Examples BSTR and LPWSTR

    BSTR v LPWSTR
    This post was partly done as a reminder of the differences between two similar things, the BSTR and the LPWSTR, which I, for one, got a bit mixed up when first learning VBA Win32 API stuff.

    The last post explained some ideas about how variables and the storage of them are organised.
    Now we need to get some related and overlapping Terminology clear, at least once here, as the words crop up a lot. In themselves they are not so difficult to understand, but their similarity makes them easy to mix up leading to a lot of unnecessary confusion later.
    We are interested here in this post in getting clear 2 out of a few so called "Types" . These both relate to strings.
    The BSTR will often come up, and there can be a bit of difference of opinion as to what it is we are talking about. Very very approximately it tends to refer to the VB(A) String type, or a VB(A) String type variable, so pretty well the String as discussed in the last posts. Some smart people suggest Microsoft have their definitions of it a bit wonky and messed up. It is also sometimes referred to as a pointer, but less so than the other, the LPWSTR. They are both very much to do with pointer ideas, and difficult to explain without some recourse to diagrams.
    For the BSTR, the more accurate technical definition, if not always the most commonly used, would be a pointer of a specific form, as shown in orange in the diagram below.
    LPWSTR is one of a few similar, or at least similar looking, types that come up when talking about API things and string variables. They can crop up in other things, but we will restrict ourselves to how they crop up in API things and string variables. I choose the LPWSTR here as it is the one we are most often interested in, and also because it is the one most often confused with the BSTR. The latter is not surprising, as they are very similar.

    If we had never seen or considered the last few posts, we probably would have had another simplified picture of the types Long and String.
    Having got those slightly more advanced ideas about how variables and the storage of them are organised, this post just follows on, as more of the same, specifically leading on from the last String diagram. (Unfortunately the whole business of VBA Win32 API is a chicken and egg situation, making it difficult to give a simple clear introduction to anything. So there will be some things in the following explanation which will be unknown and perhaps appear strange to a beginner. Please just try to accept some things for now. There is nothing that will not be, or has not been, (or more likely has been and will be a few times) explained in detail nearby. )

    So the diagram and coding below attempt to make some sense of it all as clearly as possible. Starting at the bottom left, one might consider, as people often do, that strBSTR is the VBA variable or VBA pointer. It is, strictly speaking, the symbol for the pointer held in a stack of active variables that we don’t have easy access to, going by the name of the Common Object File Format ( COFF ) symbol table, which could perhaps, by a layman, be regarded as some sort of stack or shelf arrangement populated by some complex rules allowing software to access it as necessary. Our interest starts at what is "in" this, in other words the number held at this stack / shelf location. It is a number which is made with 4 bytes (32 bits). It, or the mechanisms associated with it, is / are often called a Pointer. At the point that this number is made of 32 Bits and is pointing to a similar sized number, we are starting on, or are on the border of, what could be called the BSTR. You should be a bit confused at this point. The BSTR is a 32 bit number which is part of something else, which includes the final string of interest to us. The final thing, and some other details about it, go under the type definition of a BSTR. Without a diagram, and perhaps also some coding, any explanation is incomplete, in my opinion.
    The Dim strBSTR As String got us this far, and Debug.Print VarPtr(strBSTR) has just told us about it.
    So at this point we have not really started talking about the BSTR but have mentioned it around the edge, and not really talked much about the LPWSTR, which comes in from another direction as it were….
    At this particular stage, the second box, the dark orange one, has a value of 0, as given by VarPtr(ByVal strBSTR).
    At this point, opinions vary as to whether we actually have anything to do with a BSTR, since generally a pointer has a value, (other than 0). This could be regarded as the vbNullString state. (Later in "more full" string states, the assignment strBSTR = vbNullString would bring us back to this state )
    https://i.postimg.cc/9MjdfkxX/vb-Null-String-state.jpg

    Code:
    Sub BSTR_LPWSTR() '
    Dim strBSTR As String, strNew As String, Boo As Boolean, pz1PWSTR As Long, pz2PWSTR As Long
    Debug.Print VarPtr(strBSTR) '        1831480        This could be regarded as getting me the variable, strBSTR. It is the symbol for the pointer stored on the COFF symbol table
    Debug.Print VarPtr(ByVal strBSTR) '     0     Our "BSTR Pointer"  is empty at this point
     
    The olive coloured arrow there might be considered the VBA pointer

    Achieving a BSTR
    In some other notes we see that there is not much difference in the situation achieved after an assignment of strBSTR = "" or strBSTR = "A". Both make a similar significant difference to the computer memory allocation, and certainly at this point we have the BSTR in one form or another, depending on your viewpoint.


    You can see that as we add a character, we simply slip in another two Bytes of memory

    So I will use the simplest state first, the so called "zero length string" state, as this simply makes the diagram easier to follow.
    So the code line Let strBSTR = "" would result in something like this
    https://i.postimg.cc/sXvMGKd8/Zero-l...-situation.jpg




    Similarly if we then went on to do Let strBSTR = "A" then the situation would change to something like this:
    https://i.postimg.cc/SR3zn3YG/Charac...STR-LPWSTR.jpg Character A BSTR LPWSTR.jpg
    Code:
    Sub BSTR_LPWSTR() ' https://www.excelfox.com/forum/showthread.php/2824-Tests-Copying-Pasting-API-Cliipboard-issues-and-Rough-notes-on-Advanced-API-stuff?p=24943&viewfull=1#post24943
    Rem 1
    Dim strBSTR As String, strNew As String, Boo As Boolean, pz1PWSTR As Long, pz2PWSTR As Long
    '  "vbNullString  state"
    Debug.Print VarPtr(strBSTR) '        1831480     This could be regarded as getting me the variable, strBSTR. It is the symbol for the pointer stored on the COFF symbol table
    Debug.Print VarPtr(ByVal strBSTR) '      0       Our Pointer is empty at this point
      
    '  "Zero length string state"
     Let strBSTR = ""
    Debug.Print VarPtr(strBSTR)      '   1831480     There is no reason for this to change
    Debug.Print VarPtr(ByVal strBSTR) '  195893860   We now have something significant that we can definitely relate to a string character storage
    '  "A" state
     Let strBSTR = "A"
    Debug.Print VarPtr(strBSTR)      '   1831480     There is no reason for this to change
    Debug.Print VarPtr(ByVal strBSTR) '  195894740   A different address now: the one character string is held in a newly allocated piece of memory
    End Sub
    
    LPWSTR v BSTR
    With reference to the last diagram, we can see two things in common, and both are important characteristics of these two data "Types" :
    _ They both "point" to the same place, a place related to the storage of a character string array in computer memory.
    _ They both have a terminating null character. (Important to note that this is the character 0, often seen in codings as something like vbNullChar or Chr(0). Its decimal code point is 0, typically the very first character in any computer character convention, and so consequently its hexadecimal or binary value is 0 also. But it is not the number character 0: the number character zero, that is to say the number before the number 1, has, in most computer character list conventions, the code point decimal 48, hexadecimal 30, binary 110000 , https://i.postimg.cc/vTwrfJ0N/Charac...STR-LPWSTR.jpg . )
    In layman's terms, "what they are" could be thought of as a number that specifically is the first address of a character array which has a terminating null character. That character array is what is likely to be seen / recognised when being "sent" or when "going" there.
    Further in layman's language, we could think of the BSTR as something we have or make in codings as a string variable, whereas, specific to our API stuff, the LPWSTR (or a similar looking data type) is what we are likely to encounter in (specifically the newer / more modern) api function documentation, to tell us what a parameter we give it is to be related to. In other words, if the parameter type asked for in documentation is LPWSTR, then potentially it looks like giving it a BSTR could be OK.
    We can see from the sketch that the LPWSTR does not have any interest in the 4 Byte length indicator, but that is part of the BSTR definition/ description.
    A final difference between the two is related to the latter. A BSTR can have null characters (Chr(0) / vbNullChar) anywhere in the string ( referred to as embedded nulls ), but you should not use them in a LPWSTR, since api things generally use the null character as their indication of the end of the string. (A BSTR has its length indication in the 4 Bytes at the start, so it does not need the end null character indication. )
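    The two views of the same memory can be compared directly. The sketch below is not from the original coding: it assumes Windows, and uses the kernel32 function lstrlenW, which takes a pointer to a null terminated wide character string (in other words something LPWSTR like) and counts characters up to the first null. The Sub name is made up for the illustration. Handing it StrPtr of a VBA string is a case of giving a BSTR where a LPWSTR is asked for.

```vba
Option Explicit

' Windows only. lstrlenW counts the characters up to the first null character.
#If VBA7 Then
    Private Declare PtrSafe Function lstrlenW Lib "kernel32" (ByVal lpString As LongPtr) As Long
#Else
    Private Declare Function lstrlenW Lib "kernel32" (ByVal lpString As Long) As Long
#End If

Sub EmbeddedNullLengths()   ' hypothetical name, for illustration only
    Dim strBSTR As String
    strBSTR = "AB" & vbNullChar & "CD"
    Debug.Print Len(strBSTR)               ' 5: VBA goes by the length field
    Debug.Print lstrlenW(StrPtr(strBSTR))  ' 2: the api function stops at the first null
End Sub
```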
    As VBA is a high level language designed to hide many things from us, we rarely experience this null character. However, when dealing with api strings we may sometimes experience it: for example, a string returned to us by an api function may have an extra null character on the end.
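    A common way of tidying up such a returned string is to cut it off at the first null character. A minimal sketch follows; the helper name TrimAtNull and the made up buffer are assumptions for the illustration, not something from any api documentation.

```vba
Function TrimAtNull(ByVal strIn As String) As String   ' hypothetical helper
    Dim lngPos As Long
    ' Position of the first null character, or 0 if there is none
    lngPos = InStr(strIn, vbNullChar)
    If lngPos > 0 Then
        TrimAtNull = Left$(strIn, lngPos - 1)
    Else
        TrimAtNull = strIn
    End If
End Function

Sub TestTrimAtNull()
    Dim strBuf As String
    ' Stands in for a buffer that an api function has filled and padded with nulls
    strBuf = "Hello" & String$(3, vbNullChar)
    Debug.Print Len(strBuf)                ' 8
    Debug.Print Len(TrimAtNull(strBuf))    ' 5
End Sub
```

    Note that this also throws away anything after an embedded null, so it only suits strings that are meant to be read the LPWSTR way.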


    Once you have taken all that in, and understood it all, you can work backwards to understand and use the typical documentation type definition. That is, if there are any that have got it right. Something along the lines, for example, of: a BSTR is a 32 bit number, but the number must be one that is the address of a null terminated Unicode character array preceded by a 4 Byte length field. But if that is all you can say, then you might understand it yourself, but you are totally useless as a Teacher, as you are totally hopeless at passing on anything you know.

    Binary Numbers in Bytes, Little Endian, Little Indian backward byte Shuffles
    Long variables (4 bytes) are always written in forward order on the Internet, or mostly anywhere, in the normal School maths way, whereas on a Windows PC they are held in reverse byte order in memory, least significant byte first (Little Endian). See here
    http://www.eileenslounge.com/viewtopic.php?f=30&t=41922
    https://www.vbforums.com/showthread....Reversing-long
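    The byte shuffle can be seen without any api call by copying a Long onto a 4 byte array with LSet. This is a sketch under the assumption of a normal Windows PC; the two Type names and the function name are made up for the illustration. The Type definitions must go at the top of a normal code module.

```vba
Option Explicit

Private Type TLong          ' hypothetical helper type: one Long, 4 bytes
    Value As Long
End Type
Private Type TFourBytes     ' hypothetical helper type: the same 4 bytes, seen one by one
    B(0 To 3) As Byte
End Type

Function BytesOfLongInMemory(ByVal lngIn As Long) As String   ' hypothetical helper
    Dim tLng As TLong, tByts As TFourBytes, i As Long
    tLng.Value = lngIn
    LSet tByts = tLng          ' copies the 4 bytes across unchanged
    For i = 0 To 3
        ' Two hex digits per byte, in the order the bytes sit in memory
        BytesOfLongInMemory = BytesOfLongInMemory & Right$("0" & Hex$(tByts.B(i)), 2) & " "
    Next i
    BytesOfLongInMemory = Trim$(BytesOfLongInMemory)
End Function

Sub ShowLittleEndianBytes()
    Debug.Print Hex$(&H12345678)                  ' 12345678      the way we write it
    Debug.Print BytesOfLongInMemory(&H12345678)   ' 78 56 34 12   the way it sits in memory
End Sub
```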





    In Rem 2 of the final full coding ( https://www.excelfox.com/forum/showt...ge19#post24949 ) a short attempt is made to investigate the effect of an embedded null. Some things in the coding may not be understandable at this stage, but the general conclusions are
    ' 2a) VBA seems happy to deal with embedded nulls
    ' 2b) Some versions of an api trim function do not trim off a trailing space as they are supposed to do. This is probably because seeing the embedded null character "stopped" it looking further along the string, and so it never saw the space to be trimmed
    ' 2c) The VBA Trim function seems to work OK. Perhaps it has extra wiring to overcome this problem ( I expect it may well use the api Trim, but has extra checks to work around the null character problem )
    Last edited by DocAElstein; 02-12-2025 at 12:19 AM.
