Null and empty
Passing a null string is very efficient. Whenever possible, use a null String instead of an empty String.
Each String type has an efficient constructor to initialize the string from a literal:
- String foo = ASCIILiteral("bar");
- String foo("bar", String::ConstructFromLiteral);
The safest option is ASCIILiteral("Foo"). It produces the same code size as String("Foo") while being faster.
The difference between the two versions is whether the length of the string is passed to the constructor. Passing the length makes the constructor faster, but it also makes the code bigger, which is a poor trade-off when the code is executed infrequently.
In general, use ASCIILiteral unless you can show improvement on a benchmark by using ConstructFromLiteral.
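The trade-off between the two constructors can be illustrated outside of WTF. The following is a minimal sketch, not the WTF implementation: LiteralRef, fromCString and fromLiteral are hypothetical names. One entry point finds the length at run time, the other gets it from the array type at compile time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a string that points at literal data without copying it.
struct LiteralRef {
    const char* characters;
    std::size_t length;
};

// Length discovered at run time: the call site only passes a pointer,
// so it stays small, but the callee has to scan for the terminator.
inline LiteralRef fromCString(const char* characters)
{
    return { characters, std::strlen(characters) };
}

// Length deduced from the array type: no scan is needed, at the price of
// a length constant (and possibly an instantiation) per distinct N.
template <std::size_t N>
constexpr LiteralRef fromLiteral(const char (&characters)[N])
{
    return { characters, N - 1 }; // N includes the trailing '\0'
}

inline void literalLengthExample()
{
    assert(fromCString("bar").length == 3);
    static_assert(fromLiteral("bar").length == 3, "length is a compile-time constant");
}
```

Whether the extra constant per call site matters depends on how hot the code is, which is why a benchmark should decide.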
AtomicString from literal
AtomicString should always use the full template version:
- AtomicString foo("bar", AtomicString::ConstructFromLiteral);
The reason is that this version makes it possible to compute the hash at compile time in the future.
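To show why knowing the literal's length in the template helps, here is a sketch of a hash evaluated by the compiler. It uses 32-bit FNV-1a purely for illustration; this is an assumption and not the hash function WTF uses. The names hashLiteral and hashAtRunTime are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// One FNV-1a step: xor in the byte, then multiply by the 32-bit FNV prime.
constexpr std::uint32_t fnv1aStep(std::uint32_t hash, char character)
{
    return (hash ^ static_cast<unsigned char>(character)) * 16777619u;
}

// Recursive form so that it is usable in constant expressions.
constexpr std::uint32_t fnv1a(const char* text, std::size_t length, std::uint32_t hash = 2166136261u)
{
    return length == 0 ? hash : fnv1a(text + 1, length - 1, fnv1aStep(hash, *text));
}

// The array reference gives the compiler the length, so the whole hash
// can be folded into a constant.
template <std::size_t N>
constexpr std::uint32_t hashLiteral(const char (&literal)[N])
{
    return fnv1a(literal, N - 1); // N includes the trailing '\0'
}

// Same algorithm for strings whose contents are only known at run time.
inline std::uint32_t hashAtRunTime(const char* text, std::size_t length)
{
    std::uint32_t hash = 2166136261u;
    for (std::size_t i = 0; i < length; ++i)
        hash = fnv1aStep(hash, text[i]);
    return hash;
}

static_assert(hashLiteral("a") == 0xE40C292Cu, "evaluated by the compiler");

inline void compileTimeHashExample()
{
    assert(hashAtRunTime("bar", 3) == hashLiteral("bar"));
}
```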
Not creating a string
Many operations can be more efficient with a literal. Do not create a String when it is not needed.
The first version is the fastest.
There are two efficient ways to concatenate strings: StringBuilder and StringOperators. Anything else is less efficient when doing more than one operation.
str = text;
str.append("a"); // == str.append(String("a"));
str.append(foo);
str += bar;
Should be (StringOperators):
str = text + 'a' + foo + bar;
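The idea behind the single-expression form can be sketched with the standard library: measure every piece first, reserve the result once, then copy each piece in. This is an illustration under that assumption and not the StringOperators implementation; concatenate, pieceLength and appendPiece are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length of one piece, without converting it to a temporary string.
inline std::size_t pieceLength(char) { return 1; }
inline std::size_t pieceLength(const char* piece) { return std::strlen(piece); }
inline std::size_t pieceLength(const std::string& piece) { return piece.size(); }

// Copy one piece into the already reserved result.
inline void appendPiece(std::string& out, char piece) { out.push_back(piece); }
inline void appendPiece(std::string& out, const char* piece) { out.append(piece); }
inline void appendPiece(std::string& out, const std::string& piece) { out.append(piece); }

inline std::size_t totalLength() { return 0; }
template <typename First, typename... Rest>
std::size_t totalLength(const First& first, const Rest&... rest)
{
    return pieceLength(first) + totalLength(rest...);
}

inline void appendAll(std::string&) { }
template <typename First, typename... Rest>
void appendAll(std::string& out, const First& first, const Rest&... rest)
{
    appendPiece(out, first);
    appendAll(out, rest...);
}

// Sums the lengths, reserves once, then appends every piece in order.
template <typename... Pieces>
std::string concatenate(const Pieces&... pieces)
{
    std::string result;
    result.reserve(totalLength(pieces...));
    appendAll(result, pieces...);
    return result;
}

inline void concatenateExample()
{
    std::string text = "x", foo = "foo", bar = "bar";
    assert(concatenate(text, 'a', foo, bar) == "xafoobar");
}
```

A char piece only needs a length of 1 and a single store, while a string literal piece needs a scan for its terminator.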
Note the use of 'a' here instead of "a", as it is more efficient.
str = foo;
for (size_t i = 0; i < foobars; ++i)
    str += "bar";
Should be (StringBuilder):
StringBuilder builder;
builder.append(foo);
for (size_t i = 0; i < foobars; ++i)
    builder.appendLiteral("bar");
str = builder.toString();
Note: If you need to append a literal char, builder.append('c'); is more efficient than appending a one-character string literal.
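A rough cost model shows why the builder wins in a loop. It rests on two assumptions that are not taken from the WTF sources: appending to an immutable string copies the whole string into a new buffer each time, and the builder doubles its capacity when it runs out of room. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bytes copied when every append allocates a fresh buffer and copies the
// old contents plus the new piece into it.
inline std::size_t bytesCopiedByRepeatedAppend(std::size_t initialLength, std::size_t pieceLength, std::size_t count)
{
    std::size_t length = initialLength;
    std::size_t copied = 0;
    for (std::size_t i = 0; i < count; ++i) {
        length += pieceLength;
        copied += length;
    }
    return copied;
}

// Bytes copied by a builder that doubles its capacity when it is full.
inline std::size_t bytesCopiedByDoublingBuilder(std::size_t initialLength, std::size_t pieceLength, std::size_t count)
{
    std::size_t capacity = initialLength ? initialLength : 1;
    std::size_t length = initialLength;
    std::size_t copied = initialLength; // the initial string is copied into the builder
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t needed = length + pieceLength;
        if (needed > capacity) {
            while (capacity < needed)
                capacity *= 2;
            copied += length; // existing contents move to the larger buffer
        }
        copied += pieceLength;
        length = needed;
    }
    return copied;
}

inline void appendCostExample()
{
    // A 3-character string followed by 100 appends of "bar".
    assert(bytesCopiedByRepeatedAppend(3, 3, 100) == 15450);
    assert(bytesCopiedByDoublingBuilder(3, 3, 100) < bytesCopiedByRepeatedAppend(3, 3, 100));
}
```

Under these assumptions the repeated append grows quadratically with the number of iterations, while the builder stays linear.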
Every string class allocates a StringImpl on the heap. The only way to avoid allocating new memory is to use the methods that take a constant literal.
On 64-bit platforms, the memory used by a StringImpl varies from 28 bytes up to (28 + length + length * 2) bytes for a string created from a copy that has been converted to 16 bits.
For example, a 10-character string created from a copy and converted to 16 bits would, once the allocator's alignment is taken into account, typically take:
- 28 + 10 = 38 bytes -> typically allocated as 64 bytes
- 10 * 2 = 20 bytes -> typically allocated as 32 bytes
Be careful when allocating strings.
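The arithmetic above can be reproduced with a small helper. The 28-byte figure comes from the text; rounding every request up to the next power of two is an assumption that happens to match the example, and real allocators use their own size classes. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// StringImpl overhead on 64-bit platforms, as stated in the text.
constexpr std::size_t stringImplHeaderBytes = 28;

// Assumption: the allocator rounds each request up to a power of two.
inline std::size_t roundUpToPowerOfTwo(std::size_t bytes)
{
    std::size_t bucket = 1;
    while (bucket < bytes)
        bucket *= 2;
    return bucket;
}

struct Footprint {
    std::size_t requested8Bit;  // header plus the 8-bit characters
    std::size_t requested16Bit; // the separate 16-bit copy
    std::size_t allocatedTotal; // both requests after rounding
};

// Footprint of a copied string of the given length after a 16-bit conversion.
inline Footprint footprintAfter16BitConversion(std::size_t length)
{
    const std::size_t requested8Bit = stringImplHeaderBytes + length;
    const std::size_t requested16Bit = length * 2;
    return { requested8Bit, requested16Bit,
             roundUpToPowerOfTwo(requested8Bit) + roundUpToPowerOfTwo(requested16Bit) };
}

inline void footprintExample()
{
    const Footprint footprint = footprintAfter16BitConversion(10);
    assert(footprint.requested8Bit == 38);
    assert(footprint.requested16Bit == 20);
    assert(footprint.allocatedTotal == 64 + 32);
}
```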
AtomicString VS String
WTF::AtomicString is a class that has four differences from the normal WTF::String class:
- It’s more expensive to create a new atomic string than a non-atomic string; doing so requires a lookup in a per-thread atomic string hash table.
- It’s very inexpensive to compare one atomic string with another. The cost is just a pointer comparison. The actual string length and data don’t need to be compared, because on any one thread two atomic strings with equal contents always share the same underlying string.
- If a particular string already exists in the atomic string table, allocating another string that is equal to it does not cost any additional memory. The atomic string is shared and the cost is looking it up in the per-thread atomic string hash table and incrementing its reference count.
- There are special considerations if you want to use an atomic string on a thread other than the one it was created on since each thread has its own atomic string hash table.
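The four points above follow from how an interning table works. Here is a minimal sketch using the standard library, not the WTF implementation: AtomTable and threadAtomTable are hypothetical names, and reference counting is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Hypothetical interning table: stores each distinct string exactly once.
class AtomTable {
public:
    // Creating an atom costs a hash lookup, plus an allocation the first
    // time a given string is seen.
    const std::string* intern(const std::string& text)
    {
        // Pointers to unordered_set elements stay valid when the set rehashes.
        return &*m_atoms.insert(text).first;
    }

    std::size_t size() const { return m_atoms.size(); }

private:
    std::unordered_set<std::string> m_atoms;
};

// One table per thread, so an atom from one thread must not be compared
// by pointer with an atom from another thread.
inline AtomTable& threadAtomTable()
{
    thread_local AtomTable table;
    return table;
}

inline void atomTableExample()
{
    AtomTable table;
    const std::string* first = table.intern("class");
    const std::string* second = table.intern(std::string("cl") + "ass");
    assert(first == second);   // equal contents share one entry: pointer comparison suffices
    assert(table.size() == 1); // the second request did not store another copy
}
```

The sketch also shows the downside: every interned string stays in the table, so long strings that occur only once make the table larger without ever being shared.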
We use AtomicString to make string comparisons fast and to save memory when many equal strings are likely to be allocated. For example, we use AtomicString for HTML attribute names so we can compare them quickly, and for both HTML attribute names and values since it’s common to have many identical ones and we save memory.
We shouldn't use AtomicString if the string we're about to create won't be shared across multiple AtomicStrings. For example, if we had used AtomicString for the strings inside Text nodes, we might end up filling the atomic string table with really long strings that typically don't appear more than once. That also slows down the hash table lookup for all other atomic strings.
(this topic is a summary of the thread "[webkit-dev] When should I use AtomicString vs String?")