DWStrList

Started by Richard Kelly, May 06, 2026, 03:48:01 PM


Richard Kelly

This old PowerBasic project I'm porting over involves creating a bunch of strings that get added together like:

sString = sString + sNewString

This can happen many times.

Would it be efficient to store each string in DWStrList and, when I need the full string, pull them from DWStrList one at a time and build the final DWSTRING?

My idea was to keep track of the total length of all the strings added and then use:

DWSTRING (BYVAL nChars AS LONG, BYREF wszFill AS CONST WSTRING)

to allocate a buffer one time to hold all the strings in DWStrList.

José Roca

Not sure I fully understand what you intend to do, but here are the key points.

DWSTRING already uses a string‑builder strategy internally, so repeated concatenations like:

sString = sString + sNewString

are not expensive when done through DWSTRING. The class minimizes reallocations: as long as there is enough capacity in the internal buffer, it simply appends using wmemmove, which is extremely fast.

When the buffer runs out of space, DWSTRING allocates a new buffer with double the previous capacity. This amortizes the cost of growth.
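
To make the amortization concrete, here is a small sketch in plain FreeBASIC that simulates a doubling buffer and counts how often it has to grow. It does not touch DWSTRING internals: the starting capacity of 16 and the 10-character pieces are assumptions chosen only for illustration.

```freebasic
' Simulation of a doubling buffer (illustrative only, not DWSTRING's code).
Const PIECE_LEN = 10              ' assumed length of each appended piece
Const PIECES = 10000              ' number of appends

Dim As Long capacity = 16         ' assumed starting capacity (hypothetical)
Dim As Long used = 0              ' characters currently stored
Dim As Long reallocs = 0          ' how many times the buffer had to grow
Dim As LongInt copied = 0         ' characters moved to a new buffer in total

For i As Long = 1 To PIECES
   If used + PIECE_LEN > capacity Then
      ' Double until the new piece fits, then count one copy of the old data.
      Do
         capacity *= 2
      Loop While used + PIECE_LEN > capacity
      copied += used
      reallocs += 1
   End If
   used += PIECE_LEN
Next

Print "final length: "; used
Print "reallocations: "; reallocs
Print "characters copied during growth: "; copied
```

With these numbers the buffer ends at 131072 characters (16 doubled 13 times), so 10000 appends trigger only a handful of reallocations, and the total copied during growth stays below the final capacity.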

If you know the approximate maximum size of the final string, you can eliminate reallocations entirely by setting the Capacity property:

DIM dws AS DWSTRING
dws.Capacity = 100000   ' // Set the initial capacity to 100000 characters
dws += "New string"
dws += "New string 2"
' ...
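
For comparison, the measure-first, allocate-once idea from the question can be sketched with built-in FreeBASIC STRINGs as a stand-in. This is only an illustration of the technique, not of DWSTRING or DWStrList; the array name and its contents are made up for the example.

```freebasic
' Allocate the result once, then copy each piece into place.
' Uses built-in STRING only; "pieces" and its contents are hypothetical.
Dim pieces(0 To 2) As String
pieces(0) = "alpha"
pieces(1) = "beta"
pieces(2) = "gamma"

' Pass 1: add up the lengths.
Dim As Long total = 0
For i As Long = 0 To 2
   total += Len(pieces(i))
Next

' Single allocation of the final size, filled with spaces.
Dim As String result = Space(total)

' Pass 2: overwrite the spaces in place with the Mid statement.
Dim As Long position = 1
For i As Long = 0 To 2
   Mid(result, position) = pieces(i)
   position += Len(pieces(i))
Next

Print result
```

Setting Capacity up front gets the same single allocation with less bookkeeping, since the appends then never need to grow the buffer.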

If your goal is to store strings for later reuse, you don't need DWStrList or linked lists. A simple array of DWSTRING is more efficient and more in line with modern memory behavior:

DIM rg(100) AS DWSTRING
rg(0) = "string 1"
rg(1) = "string 2"
' ...
DIM dws AS DWSTRING
dws += rg(0)
dws += rg(1)
' // or: dws += rg(0) + " " + rg(1)
print dws

Linked lists are a 1950s solution to a 1950s hardware problem. On modern CPUs, with deep caches and wide memory buses, they are usually worse than contiguous arrays because they destroy locality of reference.

Your best options today are:

DSafeArray, which is already optimized for Unicode strings and variants.

DSafeArray.CreateVector, which allocates a single contiguous block and is the most efficient when you know the number of elements in advance.


Paul Squires

@Richard Kelly just an FYI: DWSTRING and the built-in FB strings are much better at concatenation than PowerBasic's OLE strings. As José has pointed out, FB strings have a built-in extra buffer that allows for better concatenation performance. That is why things like PB's StringBuilder are not really worthwhile here. I built a StringBuilder for FB during my early FB days; it turned out to be a waste of time and actually performed slightly worse than simply adding strings together.
Paul Squires
PlanetSquires Software

José Roca

Even Bjarne Stroustrup, the designer of C++, discourages the use of linked lists. The reason is that they have performance issues like cache misses. Modern CPUs are fast, but memory access is relatively slow. Caches prefer to prefetch contiguous data (like arrays). Linked lists, with their scattered nodes, often cause cache misses.

Memory overhead: Each node in a linked list requires extra memory to store pointers, making them less space-efficient than arrays for small data elements.

No random access: Accessing the nth element in a linked list takes O(n) time, while arrays do this in O(1). I have added random access by making them indexed, but this adds overhead.


Richard Kelly

Good points, guys. As I have already replaced all STRING with DWSTRING, I'll just let DWSTRING do its thing.