[ Team LiB ] Previous Section Next Section

28.3 Optimize the Slowest Parts

Although there are other motivations, such as personal satisfaction, most people optimize a program to save money. Don't lose sight of this as you spend time increasing the performance of your programs. There's no sense in spending more time optimizing than the optimization itself saves. Optimizing an application used by many people is usually worth the time, especially if you benefit from licensing fees. It's hard to judge the value of an open-source application you optimize, but I find work on open-source projects satisfying as recreation.

To make the most of your time, try to optimize the slowest parts of your program where you stand to gain the most. Generally, you should try to improve algorithms by finding faster alternatives. Computer scientists use a special notation called big-O notation to describe the relative efficiency of an algorithm. An algorithm that must examine each input datum once is O(n). An algorithm that must examine each element twice is still called O(n), as linear factors are not interesting. A really slow algorithm might be O(n^2), or O of n-squared. A really fast algorithm might be O(n log n), or n times the logarithm of n. This subject is far too complex to cover here. You can find detailed discussions of this topic on the Internet and in university courses. Understanding it may help you choose faster algorithms.

Permanent storage, such as a hard disk, is much slower than volatile storage, such as RAM. Operating systems compensate somewhat by caching disk blocks to system memory, but you can't keep your entire system in RAM. Parts of your program that use permanent storage are good candidates for optimization.

If you are using data stored in files, consider using a relational database instead. Database servers can do a better job of caching data than the operating system because they view the data with a finer granularity. Database servers may also cache open files, saving you the overhead in opening and closing files.

Alternatively, you can try caching data within your own program, but consider the lifecycle of a PHP script execution. At the end of the request, PHP frees all memory. If during your program you need to refer to the same file many times, you may increase performance by reading the file into a variable.

Consider optimizing your database queries too. MySQL includes the EXPLAIN statement, which returns information about how the join engine uses indexes. MySQL's online manual includes information about the process of optimizing queries.

Here are two tips for loops. If the number of iterations in a loop is low, you might get some performance gain from replacing the loop with a number of statements. For example, consider a for loop that sets 10 values in an array. You can replace the loop with 10 statements, which is a duplication of code but may execute slightly faster.

Also, don't recompute values inside a loop. If you use the size of an array in the loop, use a variable to store the size before you enter the loop instead of calling count each time through the loop. Likewise, look for parts of mathematical expressions that factor into constant values.

Function calls carry a high overhead. You can get a bump in performance if you eliminate a function. Compiled languages, such as C and Java, have the luxury of replacing function calls with inline code. You should avoid functions that you call only once. One technique for readable code is to use functions to hide details. This technique is expensive in PHP.

If all else fails, you have the option of moving part of your code into C, wrapping it in a PHP function. This technique is not for the novice, but many of PHP's functions began as optimizations. Consider the in_array function. You can test for the presence of the value in an array by looping through it, but the function written in C is much faster.

    [ Team LiB ] Previous Section Next Section