Arrays in PHP
In my projects and on this blog, I often explore how to push PHP beyond its traditional boundaries. Whether it's writing a vector search engine based on HNSW from scratch or rethinking ORM design, there's a recurring theme: when you start manipulating massive amounts of data, convenience clashes with performance. And in PHP, nothing is more convenient—and potentially insidious—than our beloved arrays.
PHP arrays are, all things considered, a masterpiece of flexible engineering. They aren't simple vectors; they are ordered maps, internally implemented as Hash Tables. You can use them as lists, stacks, queues, dictionaries, and trees. But this omnipotence comes with a computational price. Every time you add an element, PHP has to calculate hashes, handle collisions, and maintain the insertion order through a complex structure of pointers.
When you're managing ten, a hundred, or a thousand elements, you don't notice it. But what happens when you need to load a million vector embeddings into RAM, or large matrices for scientific computing? Your server starts sweating, and memory is quickly exhausted.
This is where an often-forgotten gem of the Standard PHP Library (SPL) comes into play: SplFixedArray.
The Genesis of SplFixedArray
To understand the value of SplFixedArray, we need to go back to the PHP 5.3 era, when SPL started becoming a core component of the language. In those years, PHP's Hash Tables were massively inefficient from a memory standpoint. Every single array element needed its own separately allocated zval (the C structure that represents a PHP value) plus a bucket full of pointers, scattered all over RAM. This caused not only disproportionate memory consumption but also excessive fragmentation that thrashed the CPU cache.
The core developers' response was SplFixedArray. The goal was to provide PHP developers with a true "C-style" data structure: a fixed-size array, allocated in a contiguous memory block, indexed exclusively by sequential integers.
No hashing functions, no insertion order tracking via linked lists. Just a pure index and a value. Back then, switching from a native array to an SplFixedArray for large data iterations meant cutting memory consumption by 50-60% and doubling execution speed.
The Impact of PHP 7
If you are reading this article today, you are probably working with PHP 8.x (perhaps with the JIT enabled, as I always recommend in my experiments). And here is where things get interesting.
With the advent of PHP 7, the Zend Engine underwent a colossal rewrite. The internal structure of arrays (zend_array) was revolutionized: buckets are now allocated in a contiguous memory block, and arrays composed purely of continuous numeric indices (so-called "packed arrays") were massively optimized, bypassing hashing logic when it's not needed.
Many developers immediately proclaimed the death of SplFixedArray. "Native arrays are now super fast, SPL is no longer needed!" you would read on forums.
But the engineering reality is more nuanced. Even though the performance gap has drastically narrowed (and in some micro-optimization cases, accessing a native array can be faster due to the lack of overhead from calling an object), SplFixedArray maintains unique advantages. It doesn't have to worry about dynamic capacity. When you fill up a native array in PHP and exceed its limit, the engine has to reallocate a memory block twice the size and copy all the elements. SplFixedArray doesn't have this implicit behavior, leaving memory allocation control up to you.
Furthermore, with PHP 8.0, SplFixedArray was modernized. It dropped the Iterator interface implementation in favor of IteratorAggregate, so the object no longer carries its own iteration cursor. Each foreach now obtains a separate iterator, which makes nested loops over the same instance behave correctly, while foreach itself still runs through an internal iterator rather than userland method calls.
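A quick way to see the IteratorAggregate change in practice is a sketch like the following. It assumes PHP 8.0 or later, and the variable names are purely illustrative.

```php
<?php
// Assumes PHP 8.0+, where SplFixedArray implements IteratorAggregate.
$grid = SplFixedArray::fromArray([1, 2, 3]);

// True on PHP 8.0+; on PHP 7.x the class implemented Iterator instead.
$isAggregate = $grid instanceof IteratorAggregate;

// Each foreach gets its own iterator, so nesting over one instance is safe.
$pairs = [];
foreach ($grid as $a) {
    foreach ($grid as $b) {
        $pairs[] = [$a, $b];
    }
}

var_dump($isAggregate);      // bool(true) on PHP 8.0+
echo count($pairs), PHP_EOL; // 9: every combination of the 3 values
```

On PHP 7.x the inner loop shared the outer loop's cursor, which is why the nested pattern is only shown here for 8.0 and later.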
How to Write It
The syntax is extremely clean and object-oriented. Unlike classic arrays that grow magically, here you have to declare your intentions from the start.
// We initialize a fixed array of 100,000 elements
$size = 100000;
$fixedArray = new SplFixedArray($size);
// Assignment happens just like in a normal array
for ($i = 0; $i < $size; $i++) {
    $fixedArray[$i] = $i * 2;
}
// Reading and iteration
foreach ($fixedArray as $key => $value) {
    // Fast processing
}
The rigidity of SplFixedArray is its main strength. If you try to access an index outside the declared size, PHP will punish you immediately by throwing a RuntimeException. Non-numeric string keys are rejected too; the exact exception class depends on the PHP version (a TypeError on recent releases, a RuntimeException on older ones).
$arr = new SplFixedArray(5);
$arr['text_key'] = 'value'; // Fatal error: Uncaught TypeError on recent PHP versions (RuntimeException on older ones)
$arr[10] = 'value'; // Fatal error: Uncaught RuntimeException: Index invalid or out of range
This behavior, in a rigorous enterprise architecture, is a great advantage: it acts as a natural fail-fast mechanism that prevents the unintended creation of anomalous keys stemming from logical bugs.
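To observe the fail-fast behavior without crashing the script, the invalid writes can be wrapped in a try/catch. The sketch below uses a hypothetical helper, `tryWrite()`, that is not part of SPL. It catches Throwable on purpose, because the class thrown for a string key varies between PHP versions.

```php
<?php
// Hypothetical helper for illustration: returns whatever was thrown while
// writing $value at $key, or null if the write succeeded.
function tryWrite(SplFixedArray $arr, $key, $value): ?Throwable
{
    try {
        $arr[$key] = $value;
        return null;
    } catch (Throwable $e) {
        return $e;
    }
}

$arr = new SplFixedArray(5);

$ok         = tryWrite($arr, 4, 'value');          // last valid index
$outOfRange = tryWrite($arr, 10, 'value');         // beyond the declared size
$negative   = tryWrite($arr, -1, 'value');         // negative indexes are rejected too
$stringKey  = tryWrite($arr, 'text_key', 'value'); // class depends on the PHP version

foreach (['ok' => $ok, 'outOfRange' => $outOfRange, 'negative' => $negative, 'stringKey' => $stringKey] as $label => $error) {
    echo $label, ': ', $error === null ? 'no error' : get_class($error), PHP_EOL;
}
```

In real code you would normally let these exceptions bubble up. The helper only exists to make the different failure modes visible side by side.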
Sizing and Conversions
What happens if, despite the "fixedness," you realize you need more space?
You can resize the object using the setSize() method.
$arr = new SplFixedArray(10);
$arr->setSize(20); // Increases space while keeping previous data
$arr->setSize(5); // Truncates the array, deleting elements beyond index 4
You need to be careful: calling setSize() is not free. Internally, the C engine will have to allocate new memory and move the data. If you find yourself calling `setSize()` repeatedly inside a loop, it means you're using the wrong tool and you'd be better off relying on a normal array.
SPL also provides quick methods to move to and from the world of classic arrays:
$nativeArray = [1, 2, 3, 4, 5];
$fixed = SplFixedArray::fromArray($nativeArray, false); // false: ignore the original keys and reindex from 0
$backToNative = $fixed->toArray();
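The effect of the second parameter of fromArray() is easiest to see with a sparse input. The following is a small sketch; the `$sparse` array is made up for the example.

```php
<?php
// Illustrative input with non-sequential integer keys.
$sparse = [3 => 'a', 1 => 'b'];

// Default ($preserveKeys = true): keys are kept, so the size becomes
// highest key + 1 and the gaps are filled with null.
$kept = SplFixedArray::fromArray($sparse);

// $preserveKeys = false: values are packed from index 0 in iteration order.
$packed = SplFixedArray::fromArray($sparse, false);

var_dump($kept->getSize());   // int(4)
var_dump($kept->toArray());   // [0 => null, 1 => 'b', 2 => null, 3 => 'a']
var_dump($packed->toArray()); // [0 => 'a', 1 => 'b']
```

Preserving keys on a sparse array can therefore allocate far more slots than there are values, which is worth keeping in mind before converting large inputs.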
Memory Analysis
When developing complex systems in PHP, especially when touching the boundaries of machine learning or text indexing, benchmarks become crucial.
Let's imagine having to load a million values into memory to process them (a bit like what happens when chunking documents to generate vectors). To keep the comparison simple, the figures below use plain integers.
A native packed array of 1,000,000 integers in PHP 8.3 consumes about 32 MB of RAM.
An equivalent SplFixedArray consumes about 24 MB.
A saving of about 25% in memory is still real and tangible, although the exact figures depend on the PHP version, the build, and how the array is populated. If your application processes dozens of these structures simultaneously, the difference between using SplFixedArray or classic arrays could be the difference between a process throwing an Out of Memory error and one that completes the task smoothly.
In terms of foreach iteration speed, however, the native array may come out slightly ahead of SplFixedArray because of how the Zend Engine optimizes it. On large datasets, SplFixedArray's advantage is the lower peak allocated memory.
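Since numbers like these vary between PHP versions and builds, it is worth measuring on your own setup. Below is a minimal sketch using memory_get_usage(); the helper `measureBytes()` is hypothetical, and it assumes the default memory_limit is high enough for a million integers.

```php
<?php
// Hypothetical helper: bytes allocated while $build runs, measured while
// its result is still alive.
function measureBytes(callable $build): int
{
    gc_collect_cycles();
    $before = memory_get_usage();
    $data = $build();
    $after = memory_get_usage();
    unset($data); // release the structure before the next measurement
    return $after - $before;
}

$n = 1000000;

$nativeBytes = measureBytes(function () use ($n) {
    $a = [];
    for ($i = 0; $i < $n; $i++) {
        $a[] = $i; // grows dynamically, like typical application code
    }
    return $a;
});

$fixedBytes = measureBytes(function () use ($n) {
    $a = new SplFixedArray($n); // allocated once, up front
    for ($i = 0; $i < $n; $i++) {
        $a[$i] = $i;
    }
    return $a;
});

printf("native array:  %.1f MB\n", $nativeBytes / 1048576);
printf("SplFixedArray: %.1f MB\n", $fixedBytes / 1048576);
```

No expected output is shown on purpose: the results depend on the PHP version and on how the native array is built, so run it against the same PHP binary you deploy.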
Best Practices
After years of use, here are my recommendations for those who want to integrate this class into their projects:
What NOT to do:
- Don't use it for small payloads: Under, say, 10,000 elements, the object instantiation cost far outweighs the benefits. For datasets of these sizes, native arrays are perfectly fine.
- Don't repeatedly use `setSize()`: Resizing the array kills performance. Always allocate the maximum space you think you'll need.
What to DO:
- Allocate in advance: Before initializing the array, try to determine the exact size. If you don't know it but have an estimated maximum cap, allocate that and keep track of valid insertions with a separate counter, rather than resizing.
- Profile before optimizing: Use profiling tools to evaluate performance. Don't blindly replace native arrays with SplFixedArray. Do it only in bottlenecks where RAM is the limiting metric!
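The "allocate the cap and keep a counter" advice can be sketched as follows. The input data and the filtering rule (keep the even numbers) are invented for the example.

```php
<?php
// Hypothetical scenario: keep the even numbers from an input whose maximum
// size is known, without ever calling setSize() inside the loop.
$input = range(1, 10);
$cap   = count($input);        // upper bound on how many values we may keep

$buffer = new SplFixedArray($cap);
$count  = 0;                   // number of valid slots filled so far

foreach ($input as $n) {
    if ($n % 2 === 0) {
        $buffer[$count++] = $n;
    }
}

// Only the first $count slots hold data; the rest are still null.
for ($i = 0; $i < $count; $i++) {
    echo $buffer[$i], PHP_EOL; // prints 2, 4, 6, 8, 10 on separate lines
}
```

If the unused tail matters, a single setSize($count) after the loop trims it in one step, which is very different from resizing on every insertion.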
In Conclusion
PHP has evolved in extraordinary ways. From a language for simple templating scripts, it has become a robust engine capable of executing heavy calculations, running long-lived daemons (thanks to Swoole, RoadRunner, or FrankenPHP), and handling massive workloads.
In this modern ecosystem, going back to basics is vital. Learning to use tools like SplFixedArray reminds us that, beneath all the abstractions and modern syntax, we are still working with a machine that requires CPU and RAM.
Knowing when to use a Hash Table (the standard array) and when to opt for contiguous memory (SplFixedArray) is exactly that leap in engineering awareness that separates a script that "simply works" from software designed to scale. Let's keep writing mindful code!