Top 10 Java Performance Problems.pdf

2019-07-03 09:43:53, 677 KB, PDF

Top 10 Most Common Java Performance Problems by AppDynamics
Introduction

My career in performance began, as you might guess, with a catastrophe. I was working as an architect at a marketing company that runs surveys, and one of our promotions took off. It was featured on the AOL homepage, and thousands of people began visiting our site. What started as a great success quickly turned into a crisis as our application servers began to fall over under the unprecedented load. Our application environment was actually pretty heavy-duty for the time: we had a Cisco load balancer in front of four WebLogic instances running on Solaris with an Oracle cluster behind them. We were grossly underprepared, however, for the load we were about to receive.

As soon as the promotion appeared on the AOL homepage I watched our session counts start climbing from 10 into the thousands, at which point things started heading downhill quickly. I was restarting the WebLogic instances as fast as I could, like a terrible game of Whack-a-Mole. Eventually, however, it got out of our hands: we had to ask AOL to remove our survey from their homepage, because we simply couldn't handle the traffic.

When this happened I had virtually no experience with performance analysis, but I quickly realized how important performance was to everyone in the business. Over the next 13 years I learned all I could about Java performance and architecture so that I could help my company and clients get their apps up to speed. Over those 13 years I saw my fill of performance-related issues and noticed a disturbing trend: most performance issues in Java can be attributed to a handful of root causes. Sure, occasionally I saw some bizarre corner cases that came out of the blue and wreaked havoc in an app, but for the most part performance issues are pretty cookie-cutter.

In this eBook, I'll talk about some of the most common problems I've encountered during my time as a performance analyst, along with how to recognize and address these issues to minimize their impact and prevent them from occurring in your application. I've sorted the most common issues into three main categories:

- Database Problems: Most applications of scale will eventually be backed by some form of relational or non-relational database, so for the purposes of this eBook we'll focus on three common relational database problems: persistence configuration (lazy vs. eager loading), caching, and database connection pool configuration.
- Memory Problems: Java memory management is challenging and can lead to all kinds of performance issues. I focus on what I have observed to be the two most common memory issues: garbage collection configuration and memory leaks.
- Concurrency Problems: As the complexity of applications increases we find ourselves writing code that performs more and more actions concurrently. In this section I focus on three common concurrency issues: thread deadlocks, thread gridlocks, and thread pool configuration issues.

Steven Haines

Database

The backbone of any modern web application is its data. Back in 1995, when businesses first began building web applications to house their marketing content, the database wasn't such a necessary feature of most web apps: the content was static and there was virtually no user interaction. Today, however, applications are much more complex and perform many more functions than before. As a result the database is critical to the functionality and performance of the application.

It should come as no surprise that the database is, therefore, the biggest source of performance issues for Java applications. Problems can occur in many places: your application code may access the database inefficiently, or the database connection pool may be improperly sized, or the database itself may be missing indices or otherwise in need of tuning. In this section we'll focus on three of the most common performance issues on the application side of the equation (it's probably best to leave database tuning for another eBook):

- Excessive database queries (sometimes called the N+1 problem)
- Executing database queries that should be served from a cache
- An improperly configured database connection pool

1. Death by 1,000 Cuts: The Database N+1 Problem

Problem

A common problem with Java applications that access a database is that they sometimes access the database too often, resulting in long response times and unacceptable overhead on the database. Most of the time this is not deliberate.

Back in the J2EE days when we were building Entity Beans, and specifically Bean Managed Persistence (BMP) Entity Beans, there was a problem that was referred to as the "N+1" problem. Let's say I wanted to retrieve the last 100 entries from my Order table. To do this, I'd have to first execute a query to find the primary keys of each item, such as:

    SELECT id FROM Order WHERE ...

And then I'd execute one query for each record:

    SELECT * FROM Order WHERE id = ...

In other words, it would take me 101 queries to retrieve 100 records (N+1). We'd call this problem "death by 1,000 cuts" because even though each individual query was fast, I was executing so many of them that the cumulative response time of the transaction was huge, and the database was subjected to enormous overhead. Persistence technologies have improved since then, but the N+1 problem hasn't completely disappeared, and neither has death by 1,000 cuts.

That example describes the state of database access a few years ago, but if you're not careful, this problem can reemerge in modern Hibernate and JPA code.

Hibernate and other JPA implementations provide fine-grained tuning of database access. The most common option for database tuning is the choice between eager and lazy fetching. Consider querying for a purchase order that contains 10 line items. If your application is configured to use lazy fetching but the business requirement calls for the application to show all 10 line items, then the application will be required to execute an additional 10 database calls to load that purchase order. If the application is configured to use eager fetching, however, depending on the persistence technology, you may have one or two additional queries, but not 10. It's important to understand the implications of configuration options on both your database and your persistence engine.

When an application requests an object from the database, that object may reference other objects. For example, a PurchaseOrder object may reference multiple LineItem objects. Eager fetching means that when the PurchaseOrder object is requested, all referenced LineItem objects will be retrieved from the database at that time. Lazy fetching means that when the PurchaseOrder object is requested, the referenced LineItems will not be retrieved from the database, but when a LineItem is accessed it will be loaded from the database. Eager fetching reduces the number of calls that are made to the database, but those calls are more complex and slower to execute, and they load more data into memory. Lazy fetching increases the number of calls that are made to the database, but each individual call is simple and fast, and it reduces the memory requirement to only those objects your application actually uses.

So which strategy should you use? The answer, of course, depends on your use case. If you have no intention of looking at the line items 99% of the time then lazy fetching is the correct strategy, because querying for a single line item is much faster than querying for all 10. But if 90% of the time you're going to look at the line items then eager fetching is your friend, because a couple of slower queries are better than a hundred fast ones. The point is that you need to understand how your application will be used before you configure your persistence engine.

Symptoms

The primary symptom of the N+1 problem is an increased load on, and therefore a slower response time from, the database. The problem will be hard to detect under low user load because the database will have plenty of processing power, but as application load increases it will become more and more problematic. If the application load and database load increase at the same rate then you are probably making good use of your database, but if an increase in load on the application results in a disproportionate increase in load on the database then you might have an N+1 problem.

Impact

Rating: 6

I only rate the impact of this problem as a 6 because, even though it's fairly common, you have an easy mitigation strategy: simply increase the capacity of your database. It's not a long-term solution, but it'll do the job. The good news is that once you've diagnosed this problem, fixing it is usually pretty straightforward. Just be careful when making changes to your persistence engine configurations, because different use cases will have different consequences for application performance.

Troubleshooting

After observing the symptoms of this problem, troubleshooting the root cause can be challenging. In order to be able to effectively troubleshoot an N+1 problem you need the following information:

- Counters for the number of database calls
- Counters for the number of executed business transactions
- A correlation between a business transaction and the number of database calls it's making
- An understanding of the business rules governing the business transactions that are exhibiting the problem (you don't want to fix a performance problem just to learn that you just broke a business rule)

The business transaction and database counters can be used to confirm that you have this problem, but the most important piece of information is the correlation between business transactions and the database calls they make. One warning to be aware of is that the nature of this problem is in the persistence logic, so you may very well have multiple business transactions exhibiting this problem. If you can correlate business transactions to persistence logic then you'll be in a very good place to properly remediate the problem.

Avoiding this problem

The best way to avoid this problem is to understand both your business domain and the persistence technology you're using. Making sure you're solving the correct business problem is up to you, but here's some general advice about the technology side of things:

- Do not accept the defaults without understanding what they are. In the case of eager versus lazy loading, most persistence engines default to "lazy", which may or may not be appropriate for your application. If it is not, you may be inflicting unnecessary load on your database.
- Understand tuning options in your persistence tier. Eager vs. lazy loading is one example, but there are other options that can cause issues. To truly be effective, you need to pair up the configuration options of your chosen persistence technology with the business problem you're trying to solve.

2. Credit and Debit Only, No Cache: The Importance of Caching

Years ago, Marc Fleury, the creator of JBoss, wrote a paper called "Why I Love EJBs". In it he argues that it's faster to read data from an entity bean in memory than it is to make a database call across a network. While I'm not as in love with EJBs as Marc, I can't deny that database calls can be very expensive from a performance standpoint. This is why, in recent years, many organizations have turned to caching to optimize the performance of their applications: it's much faster to read data from an in-memory cache than to make a database call across a network.

Problems with Caching

1. No cache. It doesn't take a degree in rocket science to understand that it's faster to serve content from memory than to make a network trip to a database that has to execute a query to retrieve your data. Unless you have specific reasons not to cache, you should be caching.

2. The cache is not configured properly. There are various levels and various implementations of caching, from a level 2 cache that sits between your persistence engine and your database to a stand-alone distributed cache that holds arbitrary business objects. Your persistence technology, such as Hibernate, should have support for a level 2 cache that behaves as follows: when a request for an object is made, first check the cache to see if the object is already in memory; if it is and it hasn't expired, then return that cached object; otherwise make the database call, save the object to the cache, and return the object to the caller. In this capacity, frequently used objects will be resolved without requiring interaction with a database.

Caches are not the be-all end-all solution, but once you have decided to use a cache there are a few things you need to consider:

- Caches are a fixed size
- Distributed caching is a non-trivial problem

Caches hold stateful objects, unlike pools, which hold stateless objects. For example, imagine a pool as the registers at a supermarket. When you're ready to check out, you go to whichever register is free; it doesn't matter which one you get. Caches, on the other hand, are like children at daycare. When you go to the daycare to pick up your child, you don't just pick up whichever child is available first; you're only interested in picking up your own child. Pools contain stateless objects, meaning it doesn't matter which connection you get (all connections are equal), but caches contain stateful objects because you go to a cache looking for a specific piece of data.

Stateful objects represent specific object instances, such as a specific PurchaseOrder or a specific child. Stateless objects are identical objects, such as a Phillips head screwdriver or a supermarket checker. When a stateful object is accessed it is important to retrieve a specific object, but when a stateless object is accessed, any object of that type will do.

Because caches are stateful, you must configure them to a finite size so as not to exhaust memory. When the cache is full, the cache must respond based on its configuration. For example, it might remove the least recently used object from the cache to make room for the new object. This means that sometimes the requested object may no longer be in the cache, resulting in a "miss". A miss typically results in a database call to find the requested object. The higher your miss ratio, therefore, the less you're taking advantage of the performance benefits of the cache. It's important to optimize your cache settings carefully, so that you maintain a good "hit ratio" without exhausting all the memory in your JVM.

3. Distributed caching. If you have multiple servers in a tier all writing to their own caches, how do they stay in sync? If you do not configure the caches to be distributed, then they won't. Depending on which server you hit, your results may vary (which is usually a bad thing). Most modern caches support a distributed paradigm so that when a cache is updated it will propagate its changes to other members in the cache. But depending on the number of cached nodes and the consistency of data you require, this can be expensive. Consistency refers to the integrity of your data at a point in time: if one cache node has one value for an object and another node has a different value then the two cache nodes are said to be inconsistent. On the loose end of the spectrum, caches can be eventually consistent, meaning that your application can tolerate short periods of time when the caches on different nodes do not have the same values. For example, if you post a new status on Facebook you'll see it immediately, but your friends won't see it for a couple of minutes. On the other end of the spectrum, you might require all cache nodes to have the same value before that update is considered committed. The performance challenge is to balance your distributed caching behavior with your business rules: try to opt for the loosest distribution strategy that satisfies your business requirements.

Cache consistency, sometimes referred to as cache coherence, refers to the validity of data across your entire cache. A cache is consistent if every instance of the same object in the cache has the same value. In recent times, large-scale applications have adopted eventual consistency, which means that there will be periods of time when different instances of the same object will have different values, but eventually they will have the same value.

Symptoms

The main symptom of an application that is not using a cache properly is increased database load and slow response times as a result of that load. Unlike the N+1 performance problem, the relative database load increases in direct proportion to your application load. If you're using caching correctly, database load should not increase in proportion to application load, because the majority of requests should be hitting the cache. The negative impact of this problem is that as your load increases, the database machine is subjected to more load, which increases its CPU overhead and potentially its disk I/O rate, which degrades the overall performance of all business transactions that interact with the database.

Impact

Rating: 7

I only rate a missing cache as a 7 because adding a cache is really a performance enhancement, not a necessity. If you see increased load in your database and degraded performance, one of the biggest enhancements you can make for your application is adding a cache. It would be best to plan for a cache from the beginning, but caches can be added after the fact with minimal code rewrites. Depending on where you insert your cache and your cache provider's requirements, the code impact may vary, but it's rarely significant.

Troubleshooting

Identifying the need for a cache is accomplished by examining the performance of your database, its resource usage, and the amount of load that your application is sending to the database. If you observe problems with your database's resource utilization then you should examine your business requirements and determine whether or not a cache would be a good option.

Once you have determined that you need a cache, sizing the cache appropriately is the next issue. A cache adds the most value if it can service the majority of queries made to it, meaning that it contains the majority of the most frequently accessed objects. If the cache is sized too small then a significant number of queries will require a call to the backend data store because the cache doesn't contain the value. If the cache is sized too large then it could consume an excessive amount of memory. Caches frequently publish metrics about their performance, such as through JMX. Two common metrics are the cache hit count or hit ratio and the cache miss count or miss ratio. A cache hit means that the cache serviced the request and a cache miss means that the cache did not service the request. If you observe a high miss count then your cache is sized too small.

Avoiding this problem

Plan, plan, plan! Whenever I develop an application, no matter how small, I always err on the side of performance and scalability. This is not to say that you should go to extraordinary lengths to overengineer your application. It just means that before you begin, you should ask yourself the question: if I face performance issues, is this an object that could be cached? If so, go ahead and build the object in such a way that it can be easily added to a cache later, by doing things like making the object serializable. If you're building an application of substance then I would recommend implementing caching anywhere it seems appropriate. You'll avoid many problems further down the line for a relatively small investment.

Java defines the notion of serialization as follows: an object is serializable if it implements the java.io.Serializable interface and it only contains primitive types, Strings, and other serializable objects. Practically, serializable objects can be converted into a form that can be transported to and from other servers or even to disk. Most caching solutions leverage Java's support for serialization to send objects from one machine to another.

3. Does Anyone Have a Connection I Can Borrow? Database Connection Pools

In the last section we compared pools to the registers or checkers in a supermarket. A set number of checkers are open at any given time, and the shopper doesn't care which checker they get; they choose the first available checker so they can get out of the store as soon as possible.

To take this analogy a step further, imagine now that only two registers are open during a busy time at the supermarket. What happens? If you've never experienced this (lucky you) then you can probably imagine that there would be a long line of angry customers. This is analogous to what happens when your database connection pool is too small. The number of connections to your database controls how many concurrent queries can be executed against it. If there are too few connections in the pool then you'll have a bottleneck in your application, increasing response times and angering end users (who, like Inigo Montoya, hate waiting).

Problem

Database connections are pooled for several reasons:

- Database connections are relatively expensive to create, so rather than create them on the fly we opt to create them beforehand and use them whenever we need to access the database.
- The database is a shared resource, so it makes sense to create a pool of connections and share them across all business transactions.
- The database connection pool limits the amount of load that you can send to your database.

The first two points make sense because we want to pre-create expensive resources and share them across our application. The last point, however, might seem counter-intuitive. We pool connections to reduce load on the database because otherwise we might saturate the database with too much load and bring it to a screeching halt. The point is that not only do you want to pool your connections, but you also need to configure the size of the pool correctly.

If you do not have enough connections, then business transactions will be forced to wait for a connection to become available before they can continue processing. If you have too many connections, however, then you might be sending too much load to the database, and then all business transactions across all application servers will suffer from slow database performance. The trick is finding the middle ground.

Symptoms

The main symptoms of a database connection pool that is sized too small are increased response time across multiple business transactions, with the majority of those business transactions waiting on a DataSource.getConnection() call, in conjunction with low resource utilization on the database machine. At first glance this will look like a database problem, but the low resource utilization reveals that the database is, in fact, under-utilized, which means the bottleneck is occurring in the application.

The symptoms of a database connection pool that is sized too large are increased response time across multiple business transactions, with the majority of those business transactions waiting on the response from queries, and high resource utilization on the database machine.

So while the external symptoms of these two conditions are the same, the internal symptoms are the opposite. In order to correctly identify the problem, you need to find out where your application is waiting on the database (for a connection to the database or on the execution of a query) and what the health of the database is.

Impact

Rating: 8

The impact of a misconfigured database connection pool rates an 8 on my scale because the performance impact will be observable by your users. The fix is simple but will require time and effort: use load testing and a performance analysis tool to find the optimal value for the size of your database connection pool, and then make the configuration change.

Troubleshooting

Identifying that you truly have a database connection pool configuration problem requires insight into what your application is doing:

- If your application is waiting on calls like DataSource.getConnection() and your database is under-utilized then your connection pool is too small.
- If your application is waiting on database query executions, such as PreparedStatement.execute(), and the database is over-utilized then your connection pool is too large (or your database and your queries need to be tuned!).

Avoiding this problem

Database connection pool problems are really a combination of connection pool size tuning, SQL query tuning, and database tuning. If your queries are optimized and your database is properly tuned then your database can support more load than if your queries are sloppily written and the database is not tuned. Therefore I recommend the following approach:

1. Tune your SQL queries, either manually using your favorite SQL tuning book as a guide or automatically using a tool like SQL Optimizer. Ensure that your SQL is top-notch (and this includes your HQL and EJB QL as well).
2. Estimate the relative balance between the various queries your application will be executing (determined by your estimation of business transaction balance and your understanding of the queries executed by each business transaction).
3. Execute a load test against your database and tune your database to optimally support these queries.
4. Load test your application in a production-like environment (same number and same class of machine if possible). Run multiple iterations with different database connection pool settings and choose the best fit for your application. You want to ensure that you do not saturate the database; if you can find the number of connections just below that saturation point then you have your golden value.
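
To make the sizing trade-off concrete, here is a minimal single-threaded sketch of a fixed-size pool. It is only an illustration, not how any real connection pool is implemented: FakeConnection, FixedPool, and both helper methods are hypothetical names invented for this example, and a production pool would be thread-safe and would block callers rather than return null.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PoolSizingDemo {
    /** Hypothetical stand-in for a pooled database connection. */
    static final class FakeConnection {
        final int id;
        FakeConnection(int id) { this.id = id; }
    }

    /** Fixed-size pool: connections are created up front and handed out on request. */
    static final class FixedPool {
        private final Deque<FakeConnection> idle = new ArrayDeque<>();

        FixedPool(int size) {
            for (int i = 0; i < size; i++) {
                idle.push(new FakeConnection(i));
            }
        }

        /** Returns an idle connection, or null when the caller would have to wait. */
        FakeConnection tryBorrow() {
            return idle.poll();
        }

        /** Puts a connection back so another transaction can use it. */
        void giveBack(FakeConnection connection) {
            idle.push(connection);
        }
    }

    /** Of `requests` transactions arriving at once, how many find the pool empty? */
    static int transactionsLeftWaiting(int poolSize, int requests) {
        FixedPool pool = new FixedPool(poolSize);
        int waiting = 0;
        for (int i = 0; i < requests; i++) {
            if (pool.tryBorrow() == null) {
                waiting++; // nobody has returned a connection yet
            }
        }
        return waiting;
    }

    /** The pool size caps how many queries can reach the database at the same moment. */
    static int concurrentQueries(int poolSize, int requests) {
        return Math.min(poolSize, requests);
    }

    public static void main(String[] args) {
        // Two "registers" open, ten shoppers: most of them queue in the application.
        System.out.println("pool=2  waiting=" + transactionsLeftWaiting(2, 10)
                + " db queries=" + concurrentQueries(2, 10));
        // Fifty connections, ten shoppers: nobody waits, all load goes to the database.
        System.out.println("pool=50 waiting=" + transactionsLeftWaiting(50, 10)
                + " db queries=" + concurrentQueries(50, 10));
    }
}
```

With a pool of 2 the waiting happens in the application, which matches the "too small" symptoms; with a large pool nothing waits on the pool, and whether the database can absorb that many concurrent queries is exactly what the load test in step 4 has to establish.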

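The N+1 pattern from section 1 can also be reproduced without a real database. The sketch below assumes a hypothetical in-memory FakeOrderTable that does nothing except count round trips; the class and method names are invented for illustration and do not correspond to any Hibernate or JPA API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NPlusOneDemo {
    /** Hypothetical in-memory stand-in for the Order table; each method call is one "query". */
    static final class FakeOrderTable {
        private final Map<Integer, String> rows = new TreeMap<>();
        int queryCount;

        FakeOrderTable(int size) {
            for (int id = 1; id <= size; id++) {
                rows.put(id, "order-" + id);
            }
        }

        /** Like: SELECT id FROM Order WHERE ... */
        List<Integer> findIds() {
            queryCount++;
            return new ArrayList<>(rows.keySet());
        }

        /** Like: SELECT * FROM Order WHERE id = ... */
        String findById(int id) {
            queryCount++;
            return rows.get(id);
        }

        /** One query that returns every row at once. */
        List<String> findAll() {
            queryCount++;
            return new ArrayList<>(rows.values());
        }
    }

    /** The N+1 access pattern: one query for the keys, then one query per record. */
    static int queriesOneByOne(int n) {
        FakeOrderTable table = new FakeOrderTable(n);
        List<String> orders = new ArrayList<>();
        for (int id : table.findIds()) {
            orders.add(table.findById(id));
        }
        return table.queryCount;
    }

    /** The same data fetched in a single round trip. */
    static int queriesSingleFetch(int n) {
        FakeOrderTable table = new FakeOrderTable(n);
        table.findAll();
        return table.queryCount;
    }

    public static void main(String[] args) {
        System.out.println("one by one:   " + queriesOneByOne(100) + " queries");
        System.out.println("single fetch: " + queriesSingleFetch(100) + " queries");
    }
}
```

Counting calls per business transaction in this way is the same correlation the Troubleshooting list for the N+1 problem asks for: 101 queries for 100 records is the signature to look for.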