by Oliver
28. November 2017 10:58
Disclaimer: This is just a bunch of short notes I took during a conference session held by Daniel Marbach at the .NET Developer Days. To dive deeper into the topic, please head over to the following GitHub repo, run the examples, and examine the source code: https://github.com/danielmarbach/Await.HeadExplosion/tree/master/Presentation.

My takeaways:
- Use Task.Run() for compute-bound operations; avoid the low-level method Task.Factory.StartNew() in the async world.
- Use await Task.Delay() instead of Thread.Sleep().
- Since .NET Framework 4.6, there's a cached static instance Task.CompletedTask that can be used wherever you want to return a completed task – no allocation needed.
- Opt out of context capturing by calling .ConfigureAwait(false) – when the previous execution context is not needed for the continuation of the async task.
- Concurrency != parallelism: a lot of tasks can be handled by a single thread-pool thread concurrently – only one thread will be used. Tasks can be split across many threads to achieve parallelism – when desired/necessary.
- Use SemaphoreSlim to limit parallelism, e.g. to allow at most 3 concurrent operations.
- Use ValueTask<T> for high-performance code – a wrapper for situations where either a Task is supposed to be executed or its return value is already available, e.g. cached.
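A minimal sketch pulling several of these notes together – Task.Delay() instead of Thread.Sleep(), SemaphoreSlim to cap concurrency at 3, ConfigureAwait(false) where the original context isn't needed, and Task.CompletedTask for a synchronous fast path. The workload itself is made up for illustration:

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class Demo
{
    // Allow at most 3 operations to run concurrently.
    private static readonly SemaphoreSlim Throttle = new SemaphoreSlim(3);

    public static async Task Main()
    {
        // 10 pieces of work, but never more than 3 in flight at the same time.
        var results = await Task.WhenAll(Enumerable.Range(1, 10).Select(ProcessAsync));
        Console.WriteLine($"Sum: {results.Sum()}");
        await LogAsync("done");
    }

    private static async Task<int> ProcessAsync(int id)
    {
        await Throttle.WaitAsync().ConfigureAwait(false); // continuation doesn't need the original context
        try
        {
            await Task.Delay(100).ConfigureAwait(false);  // simulate async I/O – not Thread.Sleep()!
            return id * 2;
        }
        finally
        {
            Throttle.Release();
        }
    }

    // When there's nothing asynchronous left to do, return the cached
    // completed task instead of allocating a new one.
    private static Task LogAsync(string message)
    {
        Console.WriteLine(message);
        return Task.CompletedTask;
    }
}

Happy asyncing!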
by Oliver
8. November 2017 09:00
This is a digitized version of my session notes from the conference.

Key Questions
- How fast is it?
- How fast could it be?
- How fast should it be?
Always measure your gains – which implies you need to measure your current performance.

Allocations Cost – Sometimes A Lot
Here's an example from Hyperion (fork of Wire):
- original code – allocates a new byte array on every call
- optimized code – reuses a byte array from a pool
What are the gains of this small change? Optimizations like this one pay off especially in low-level code or inside of libraries that will be consumed by third parties. As always – first measure, then optimize. (A small sketch of the pooling idea follows at the end of this post.)

Tools for the performance-minded
- BenchmarkDotNet: a .NET benchmarking framework
- marten: async document database and/or event store
- Hyperion: a high performance polymorphic serializer for the .NET framework, built for Akka.NET
- Jil: Fast .NET JSON (De)Serializer, Built On Sigil
- protobuf-net: Protocol Buffers library for idiomatic .NET
- Microsoft Bond: cross-platform framework for cross-language de/serialization with powerful generic mechanisms (this is not your go-to tool when you just want to de/serialize some data quickly ;-) – it's a whole framework)
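The Hyperion code itself isn't reproduced here, but a minimal sketch of the "allocate vs. pool" idea – using System.Buffers.ArrayPool<T>, which is one way to get such a pool in .NET – could look like this (the serializer context is made up):

using System;
using System.Buffers;
using System.IO;

public static class Int32Writer
{
    // Original style: allocates a new byte array on every call.
    public static void WriteAllocating(Stream stream, int value)
    {
        var buffer = BitConverter.GetBytes(value); // new 4-byte array per call
        stream.Write(buffer, 0, buffer.Length);
    }

    // Optimized style: rents a buffer from a shared pool and returns it afterwards.
    public static void WritePooled(Stream stream, int value)
    {
        var buffer = ArrayPool<byte>.Shared.Rent(4); // may hand back a larger array
        try
        {
            buffer[0] = (byte)value;
            buffer[1] = (byte)(value >> 8);
            buffer[2] = (byte)(value >> 16);
            buffer[3] = (byte)(value >> 24);
            stream.Write(buffer, 0, 4);
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(buffer); // give the buffer back to the pool
        }
    }
}

In a hot serialization path, the pooled version removes one garbage-collected allocation per call – exactly the kind of saving that only shows up once you measure.

Have fun and stay focused!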
by Oliver
26. October 2017 10:00
Although the choice of the best session was not easy, it has to be awarded to Sasha Goldshtein with his session on performance detective work. His session was very well prepared, had a clear goal and path, was packed with insights into challenges and problems during performance investigation work, and offered hand-crafted solutions like his own real-time ETW event tracing tool etrace. What follows is an excerpt of Sasha Goldshtein's presentation.

Structure of a Performance Investigation
- Obtain the problem description
- Build a system diagram
- Run a quick performance checklist
- Understand which component is exhibiting the problem
- Investigate thoroughly
- Find the root cause
- Resolve the issue
- Verify resolution
- Conduct and document post-mortem

Performance Metrics, Goals, Monitoring
Performance metrics don't live in a vacuum! Derive performance metrics from business goals. Monitor these metrics in your APM solution, home-made dashboard, or collection script, and get alerts.

Investigation Anti-Methods
- Make assumptions
- Trust "instincts" and irrational beliefs
- Look under the street light
- Use random tools
- Blame the tools

The USE Method
USE: Utilization, Saturation, Errors
- Build a functional diagram of the system, including hardware/software resources
- For each resource, identify utilization, saturation, and errors
- Understand, resolve, and verify errors, excessive saturation/utilization, under-utilization

Statistics Lie – Be Careful With Statistics
- Averages are meaningless
- Medians are almost meaningless
- Percentiles are OK if you know what you're doing
- Find good visualizations for your performance data
- Beware coordinated omission
Look at histograms or sometimes even percentile plots aka cumulative distribution charts to really understand your data, e.g. your performance traces. Just look at this dinosaur to understand that very differently shaped data can lead to the same statistics values. (A small numeric example of how much an average can hide follows at the end of this post.)

Conduct a Postmortem – Do It!
- Document the steps taken to identify, diagnose, resolve, and verify the problem
- Which tools did you use? Can they be improved?
- Where were the bottlenecks in your investigation?
- Can you add monitoring for sysadmins/ops? Can you add instrumentation for investigators?
- How do we triage this problem automatically next time it happens?

Resources
- etrace: a real-time, command-line frontend for ETW events –> https://github.com/goldshtn/etrace
- LiveStacks: a real-time, command-line stack collector and resolver –> https://github.com/goldshtn/LiveStacks
- BenchmarkDotNet is a powerful .NET library for benchmarking: http://benchmarkdotnet.org/
- HdrHistogram: A High Dynamic Range (HDR) Histogram –> https://github.com/HdrHistogram/HdrHistogram
- Slides: How NOT to Measure Latency: https://www.azul.com/files/HowNotToMeasureLatency_LLSummit_NYC_12Nov2013.pdf
- Blog post: Windows Process Memory Usage Demystified
- Course: Statistics for Engineers – https://github.com/HeinrichHartmann/Statistics-for-Engineers (use http://nbviewer.jupyter.org/ to view the ipynb files)
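As a tiny illustration of how much an average can hide, here's a hedged C# sketch with made-up latency numbers and a simple nearest-rank percentile:

using System;
using System.Linq;

public static class LatencyStats
{
    public static void Main()
    {
        // 95 fast requests and 5 very slow ones – a typical long-tail distribution.
        var latenciesMs = Enumerable.Repeat(20.0, 95)
                                    .Concat(Enumerable.Repeat(2000.0, 5))
                                    .ToArray();

        Console.WriteLine($"average: {latenciesMs.Average():F0} ms");        // 119 ms – looks acceptable
        Console.WriteLine($"median : {Percentile(latenciesMs, 50):F0} ms");  // 20 ms – hides the tail completely
        Console.WriteLine($"p99    : {Percentile(latenciesMs, 99):F0} ms");  // 2000 ms – what the slowest users actually see
    }

    // Simple nearest-rank percentile – good enough to make the point.
    private static double Percentile(double[] values, double percentile)
    {
        var sorted = values.OrderBy(v => v).ToArray();
        var rank = (int)Math.Ceiling(percentile / 100.0 * sorted.Length) - 1;
        return sorted[Math.Max(rank, 0)];
    }
}

And now, happy performance hunting!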
by Oliver
17. November 2015 23:11
We've recently been doing some optimization work on Camping.info to improve user experience through faster web site load times. This post goes into the details of the optimization steps and their effect on the site's performance.

Measure, Improve, Measure again
To be confident that the changes we will introduce to Camping.info actually improve the performance or perceived performance of the site, we set up an automated test harness on TeamCity using the webpagetest API wrapper node module and a custom powershell wrapper script around that, which collects the test results and reports them to TeamCity. The following paragraphs will go into some detail on the concrete steps we took to improve our users' experience.

Include external script in existing bundle
As described in Avoid Blocking Requests on External Domains, we chose to include the Bugsnag javascript library in our already existing script bundle. This saves one request and one DNS lookup. Here's a look at the start page performance before and after: the savings are humble but noticeable – the Time To Start Render drops from >1200 ms to 1100–1200 ms, which in practice will correlate with a slightly faster page appearance.

Host jQuery on your own server – or don't
Based on the previous improvement I assumed that saving a DNS lookup alone could already help in improving perceived performance. So for loading jQuery we switched from cdnjs.cloudflare.com to our own domain. It turns out though that this didn't have any impact on rendering or load times. This is actually a tricky optimization – it depends a lot on who your audience is and what sites they visit. Loading e.g. jQuery from an external host will either save one request, because the client's browser might have that resource cached after visiting a totally unrelated site that includes the same library, or your user will pay for an extra DNS lookup as compared to just loading the library from your own server. The decision is up to you.

Load external javascript after window.load
A large block of potential optimization on Camping.info actually concerns the loading, parsing, and execution of javascript. Due to the organic growth of the site over the last 8 years, deferring javascript execution to a point in time after the page has actually rendered turns out to be a complex issue. We still have plenty of pre-loaded or inline javascript blocks, which is mostly due to the way ASP.NET WebForms and its UpdatePanels work. The only easy solution to dependency management for all of those code blocks was to simply load all dependencies before the HTML that refers to them. This pattern, unfortunately, has led to one large script bundle that we load in the <head> section of the page, because loading it later would mean breaking all inline script block execution. Fixing this will require significant refactoring and thorough testing.

But there still is room for improvement! All external javascript can safely be loaded after the page has rendered. This includes Facebook buttons, the AddThis widget, and advertisement scripts. We already had most of these scripts loading after the window onload event, but additionally deferring loading of connect.facebook.net/en_US/fbds.js showed the following improvement on the start page: while the render start time did not decrease, the page load time decreased from around 1.8 s to 1.5 s. This is definitely a decent improvement, but please don't overrate it – most of the page's content had probably already been loaded even in the old version.
But now we can at least be sure that all Facebook assets will definitely be loaded only after all of the page's own assets have been loaded. And that's good. It turns out that on a different page, the improvement after this single change is actually even more significant: here we can see that the deferred loading of the Facebook script actually improves not only the page load time, but also the start render and DOM content ready times.

One script is still being loaded before the onload event – Google Analytics. I couldn't quite convince myself to defer its loading until after onload, because we use it to track some user metrics and timings, and I felt that GA might not report the same quality of results if loaded too late. Please leave your opinions on this topic in the comment section.

Specify image dimensions inline to speed up rendering
The worst grade in our PageSpeed score was actually for not specifying image dimensions, neither in HTML nor in CSS. So we went ahead and did that for the start page. Here's how that improved our score. Still, I honestly cannot tell any difference in performance with image dimensions provided. There are several possible causes for this:
- maybe the images in the above-the-fold content are loaded fast enough to not delay page rendering
- maybe the page's CSS allows the browser to start rendering even without knowing the exact image dimensions
- something that I have no clue about at the moment

Loading CSS file from same domain
To speed up rendering it also seemed to be a good idea to deliver our site's CSS file from the same domain as the HTML, thus saving a DNS lookup during the early stage of page rendering. Actually, the start render time dropped a bit by doing that, but unfortunately the page load time increased a bit indeterministically. It's safe to assume that the additional load time was caused by the fact that all image resources that are referenced in our CSS were now also being retrieved from the main domain instead of the cookieless one, which in turn delayed loading of other image resources. For now we reverted this change, but we know that we can further optimize the render process by serving our CSS even faster. It would probably also help a lot if we split our large CSS file into smaller ones that could be loaded per page.

Changes without performance impact
- Wrapping inline javascript blocks in $().ready()

Todos for the next performance sprint
- defer loading of as many javascript files as possible to after the onload event
- combine and minify ASP.NET AJAX's ScriptResource.axd and WebResource.axd files
- load CSS from page domain but referenced images from cookieless domain (try css-url-rewrite)
- load less CSS per page – ideally inline the CSS needed for the above-the-fold content
- use HTML and CSS instead of images for our Google map buttons – this will save a ton of requests on the search page

Where are we at now?

Happy performance tuning!
by Oliver
14. November 2015 21:40
Get your own WebPageTest server and test agent up and running in minutes, not hours!

Motivation
The original documentation can be found here and here. Unfortunately, it's a bit vague in some parts, especially if you don't set up infrastructural pieces and cloud server instances on a daily basis. So here's a how-to guide to get you up and running as fast as possible.

Infrastructure overview
To run web page tests against your own private instance, we need:
- a WebPageTest server instance (the master)
- one or more WebPageTest test agents (the clients)
The master receives test jobs and delegates them to one of the clients. You can run tests from the web interface or using the API through the webpagetest node module. You might want to think about where in the world you want to spin up those virtual machines. The WPT server (master) can really be hosted anywhere you want, but the test agents' (clients') location should be chosen consciously, because their distance to the tested site's server will play a role in the results you will see later during testing.

How to set up the master (WPT server)
You need an Amazon AWS account to set this up quickly. If you haven't got one, you either quit here and set up your own server with the WebPageTest stack, or you go and create one.
Now, go to your AWS dashboard, to Instances –> Instances, and click "Launch Instance". On the next screen, go to Community AMIs, enter one of the ids that can be found here – I chose ami-22cefd3f (eu-central-1) – and hit "Select". In step 2, you can choose a t2.micro instance. The WPT server does not need to be high performance – it only delegates test execution and gathers the results. It's when setting up the client (test agent) that we'll have to pay attention to the performance of the instance.
Now, you keep clicking Next until you reach "Step 6. Configure Security Group". Here we need to add a firewall rule that allows us to access our WPT server (master) through HTTP, otherwise no testing will be possible. Giving the security group a more descriptive name and description (❸) is optional but nice. In step 7, review your settings if you want, then hit "Launch". As highlighted in the screen above, AWS will now want to assign a (ssh) key pair to this instance. In case you have an existing key pair you can re-use that. In case you're doing this for the first time, you won't have any existing key pairs to choose from and will have to create a new one. The "Launch Instances" button will activate only after you've downloaded your private key (❸). Clicking ❷ gets you to the Instances overview that was empty at the beginning, where you'll find the public IP address and DNS entry of your instance.
Congratulations, you've successfully completed the setup of the WPT server (master)! If you now open http://your.instance.ip you should see the WebPageTest UI.
To log into your instance via SSH follow one of the guides here. In short: either use ssh from the command line, available on all linuxes and even on Windows if you have Git installed, e.g. in C:\Program Files\Git\usr\bin:
ssh -i wpt-server.pem ubuntu@[public-ip|public-dns]
Or, on Windows, use PuTTY. In this case you'll first have to generate a PuTTY compatible private key file from your *.pem file and then you can connect through PuTTY. Here's how to do that.

How to set up the client (WPT test agent)
Now, we need to set up at least one test agent to actually execute some tests.
There's a long list of pre-configured, regularly updated Windows AMIs with all software installed that's needed to execute tests in the documentation. To get started quickly, pick one that contains all major browsers and is located in your favorite region. In this guide, we're going to use ami-54291f49 (IE11/Chrome/Firefox/Safari) in region "eu-central (Frankfurt)".
Basically, we repeat the steps from the master setup, but now using the test agent AMI. In step 2, when choosing an Instance Type, we'll now have to ensure that our agent will deliver consistent results. This performance review recommends the following choices (prices will vary by region, the ones displayed here were for US East N. Virginia), quoting:
- If you're running just a couple tests per hour, on small HTTP sites, a t2.micro will be okay ($13/month)
- If you're running just a couple tests per hour, on large or secure sites, you'll need to use a t2.medium ($52/month)
- If you're running lots of tests per hour, you can't use t2's – the most efficient agent will be a c3.large ($135/month)
In step 3, we have to configure our test agent with the following information:
- where to find the WPT server (master): use the public IP address or DNS name
- what's the location (name) of this agent: a string used in the locations.ini of the master
To be honest, I haven't quite wrapped my head around the auto-scaling feature of WPT. That's why we set up a single location ("first") manually that this client will be identified with. In the user data field under Advanced Details we enter:
wpt_server=52.29.your.ip
wpt_location=first
Now, either click your way through the remaining steps or jump to "Review and Launch" and launch your test agent instance. The key pair dialog will pop up again, and now you can choose your existing key "wpt-server" to assign to that instance. You won't use it to connect to it, anyway, because the default connection type to a Windows instance is RDP, for which a firewall rule was automatically added in step 6. After launching, a link will be available with instructions on how to connect to that Windows instance, but you shouldn't need to do that.

Connecting master and client
One step is left: we have to configure the master to know which test agents it can use. This part was actually one of the most tedious bits in the setup, because juggling several configuration files with lots of options and entries to make them do what you want them to do is never easy. For the manual management of test agents we need to do the following:
Log into the master, e.g. ssh -i wpt-server.pem ubuntu@pu.bl.ic.ip
Go to the folder /var/www/webpagetest/www/settings/
Edit locations.ini to contain these blocks (sudo nano locations.ini):
[locations]
1=first
default=first
[first]
1=first_wptdriver
2=first_ie
label="My first agent"
default=first_wptdriver
[first_wptdriver]
browser=Chrome,Firefox,Safari
[first_ie]
browser=IE 11
In settings.ini, at the very end, set ec2_locations=0 to hide the predefined EC2 locations from the location dropdown in the browser.
Restart NGINX: sudo service nginx restart
Now, if you go to http://your.public.ip again, you'll see "My first agent" in the location dropdown and "Chrome", "Firefox", and "Safari (Windows)" in the browser dropdown. I didn't try to find out how to show "IE 11" as well, but at this moment I didn't care. (You might have to wait a few moments before the location lists update after the NGINX restart.)
You can now run your first test!
After some 10-15 seconds you should see this screen:
And a few moments later the first results should show. Congratulations!
Automating tests via the WebPageTest API
If you've tried to run WebPageTests in an automated way, you'll without a doubt have found the webpagetest node module. With your private server and test agent set up, you'll now need to dispatch your tests like so:
webpagetest test http://my.example.com ↵
--server http://<master_ip> ↵
--key <api_key> ↵
--location first_wptdriver:Chrome
The location argument refers to the definitions in the locations.ini file. The API key can be found in the keys.ini file on the master.
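If you'd rather not go through the node module, you can also call the server's HTTP API directly. Here's a minimal C# sketch – it assumes the standard runtest.php endpoint of a private WebPageTest instance, and the server address, key, and location values are placeholders:

using System;
using System.Net.Http;
using System.Threading.Tasks;

public static class WptClient
{
    public static async Task Main()
    {
        // Placeholder values – replace with your own master IP, API key, and location.
        var server   = "http://52.29.your.ip";
        var apiKey   = "<api_key from keys.ini>";
        var location = "first_wptdriver:Chrome";
        var testUrl  = "http://my.example.com";

        using (var http = new HttpClient())
        {
            // Queue a test; f=json asks the server for a JSON response
            // containing the test id and result URLs.
            var requestUri = $"{server}/runtest.php" +
                             $"?url={Uri.EscapeDataString(testUrl)}" +
                             $"&k={apiKey}" +
                             $"&location={Uri.EscapeDataString(location)}" +
                             "&f=json";

            var response = await http.GetStringAsync(requestUri);
            Console.WriteLine(response);
        }
    }
}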
We run our test from within TeamCity using a custom script, but that's a topic for another post!
Happy WebPageTesting!
by Oliver
6. November 2015 21:32
This week we started to look into the page load performance at Camping.Info as well as our discoverize portals. After some initial testing and measuring, we came up with a list of actions that should all speed up the user perceived page load times.

The Problem
Today we'll take a look at this one request: http://d2wy8f7a9ursnm.cloudfront.net/bugsnag-2.min.js. For your info, Bugsnag is an exception tracing and management solution that I can seriously recommend to have a look at. Anyway, in their docs the Bugsnag team suggests this:

Include bugsnag.js from our CDN in the <head> tag of your website, before any other <script> tags.

That's what we initially did. It turns out, though, that the request for the bugsnag javascript library is quite costly, especially looking at the DNS lookup time of 265 ms. Here's a screenshot from a waterfall chart by GTmetrix: that's over half a second for a script of less than 3 kB in size! If you have a look at the request for WebResource.axd?d= three lines below, you'll see that that resource was loaded faster than the DNS lookup for bugsnag took.

Improve, Improve
So let's just load the bugsnag library from our own server and save that longish DNS lookup. But, wait, we can even do better than this! We already load a bunch of javascript files as a bundle inside master_CD8711… (using the great SquishIt library, by the way), so we'll just prepend a copy of bugsnag to that bundle and save a whole request altogether (a short sketch of what that could look like follows at the end of this post). Now, that's great. And that's exactly what the crew at Bugsnag recommends for advanced usages:

If you'd like to avoid an extra blocking request, you can include the javascript in your asset compilation process so that it is inlined into your existing script files. The only thing to be sure of is that Bugsnag is included before your onload handlers run. This is so that we can report stacktraces reliably.

Disclaimer
There's one drawback to this solution: you might not get the latest and greatest bits from Bugsnag when hosting your own version. I've quickly brainstormed how to fix this issue, and one way to guarantee a fresh (enough) version would be to check for the current version during your deployment process on your continuous integration server and throw an error if it's newer than the one that currently resides in our project.

Also, this is just one of several fixes to noticeably improve page load performance – we have bigger fish to catch looking at the bars up there!
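For illustration, here's a minimal sketch of such a bundle definition with SquishIt – the file names are made up, and the exact API calls may differ slightly depending on your SquishIt version:

using SquishIt.Framework;

public static class Bundles
{
    // Builds the main script bundle. The point is simply that the self-hosted
    // copy of bugsnag is prepended to the existing bundle so it loads before
    // any other script (and before onload handlers run).
    public static string RenderMainScriptBundle()
    {
        return Bundle.JavaScript()
                     .Add("~/Scripts/bugsnag-2.min.js")   // self-hosted copy, first in the bundle
                     .Add("~/Scripts/jquery-1.11.3.js")
                     .Add("~/Scripts/site.js")
                     .Render("~/Scripts/combined_#.js");  // '#' is replaced with a content hash
    }
}

The returned string is the <script> tag pointing at the combined, minified file, with bugsnag sitting at the very front of it.

Now, let's get back to performance tuning!
Oliver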
by Oliver
25. April 2015 22:16
Two days ago I finally did it: I asked a question on serverfault.com looking for advice on why our brand new server performs more poorly than our two older servers. All the hardware details speak in favor of the new server:
- CPU: Core i7-4770 @ 3.4 GHz vs. Xeon E3-1230 @ 3.2 GHz
- RAM: 32 GB vs. 16 GB
- Drives: 2x SSD vs. 2x SATA
But in reality, the older servers with the lower spec outperformed the new server by almost a factor of two! That is to say, for every 1 Request/sec processed the new server needed 4.5 % Processor time compared to 2.6 % on the old server. Here's a PerfMon screenshot of the new server:

New CPUs are really good at saving energy…
… actually so good that they will rarely bother to hurry up until you really, really stress them out. Here's a good read by Brent Ozar on an energy saving CPU that would cause certain SQL queries to run two times slower on newer hardware than on the old one! That's exactly what's been happening to us.

Power Plan: From Balanced to High Performance
That brings us to: Power Plans. Windows Server and Client OSes come installed with several Power Plans, and it just so happened that the new server we had ordered with Windows Server 2012 R2 installed had its Power Plan set to Balanced (Recommended). Well, that might be a good choice for the server hoster as it helps keep the electricity bills down, but it's absolutely not a good choice if you want your applications to perform well on that server. They will simply be a lot slower than they could be. So, open the Power Options window by typing "Power Plan" into the start menu or Windows search and check the High Performance radio button. After doing so on that new server, PerfMon would show this much more soothing picture: now we have only 1.5 % Processor time per 1 Request/sec processed. That's an improvement of factor 3. Nice!

Lesson Learned
I've learned that I'm not that good of a sys admin, yet. I had been contemplating the reasons for the poor performance of that new server of ours again and again; I had checked all kinds of settings inside IIS, ASP.NET, and the like. Those are the areas I work in day-to-day. Turns out, I needed to widen my horizon. Thanks to serverfault.com I did. And our server is at last performing as it should. Happy administrating!
by Oliver
28. May 2014 12:09
Recently, we had to make some space available in one of our SQL Express instances that was getting close to its 10 GB limit of stored data, so I set out to delete some old data from two of our largest tables. One contained about half a million rows, the other a bit over 21 million.

Simple Deletion Would Take… Forever
The simplest sql statement to delete all rows that were created before 2012 would be the following:

DELETE FROM [dbo].[Message] WHERE DateCreated < '20120101'

I can't even tell you how long this took because at 14 minutes I just cancelled the query execution (which took another 7 minutes to finish). This was the table with less than 500,000 rows where we wanted to delete a bit more than 200,000 rows.

Break Delete Operation Into Chunks
Searching for a solution to the problem, I came across this blog post on breaking large delete operations into chunks. It shows in good detail how the simple version above behaves against running a loop of a few tens of thousands of deletes per iteration. An interesting aspect I hadn't thought of at that point was the transaction log growth that can become a problem with large delete operations. Running a loop allows you to do a log backup (in full recovery mode) or a checkpoint (in simple mode) at the end of each iteration so that the log will grow much more slowly. Unfortunately, though, this didn't help with the execution time of the delete itself, as you can also see from the graphs presented in the above post.

Disable Those Indexes!
It turns out, our [Message] table had six non-clustered indexes on it, which all had to be written to for every row that was deleted. Even if those operations are fast, their processing time will add up over a few hundred thousand iterations. So let's turn them off! In fact, let's turn only those off that won't be used during our delete query. [We have one index on the DateCreated column which will be helpful during execution.] This stackoverflow answer shows how to create some dynamic SQL to disable all non-clustered indexes in a database. I've modified it slightly to disable only indexes of a given table:

Disable/Enable Table Indexes

DECLARE @table AS VARCHAR(MAX) = 'Message';
DECLARE @sqlDisable AS VARCHAR(MAX) = '';
DECLARE @sqlEnable AS VARCHAR(MAX) = '';

SELECT
    @sqlDisable = @sqlDisable + 'ALTER INDEX ' + idx.name + ' ON '
                  + obj.name + ' DISABLE;' + CHAR(13) + CHAR(10),
    @sqlEnable  = @sqlEnable  + 'ALTER INDEX ' + idx.name + ' ON '
                  + obj.name + ' REBUILD;' + CHAR(13) + CHAR(10)
FROM sys.indexes idx
JOIN sys.objects obj
    ON idx.object_id = obj.object_id
WHERE idx.type_desc = 'NONCLUSTERED'
    AND obj.type_desc = 'USER_TABLE'
    AND obj.name = @table;

RAISERROR(@sqlDisable, 0, 1) WITH NOWAIT;
RAISERROR(@sqlEnable, 0, 1) WITH NOWAIT;

--EXEC(@sqlDisable);
--EXEC(@sqlEnable);

Now, with those indexes disabled, the simple delete operation took a lot less than a minute! Even in the case of our 21 million rows table, deleting 7 million rows took only 1:02 on my machine. Of course, after deleting the unwanted rows, you need to re-enable the indexes again, which took another minute, but all in all I'm happy with the result.
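For completeness, here's a minimal sketch – not the script we actually used – of how the chunked approach described above could be driven from C# with ADO.NET. The table name matches the post; the connection string, batch size, and cutoff date are placeholders:

using System;
using System.Data.SqlClient;

public static class MessageCleanup
{
    // Deletes old rows in chunks so each statement – and therefore the
    // transaction log growth per batch – stays small.
    public static void DeleteOldMessages()
    {
        const string connectionString = "Server=.;Database=MyDb;Integrated Security=true";
        const int batchSize = 50000;

        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            while (true)
            {
                using (var command = new SqlCommand(
                    "DELETE TOP (@batchSize) FROM [dbo].[Message] WHERE DateCreated < @cutoff",
                    connection))
                {
                    command.CommandTimeout = 300; // seconds
                    command.Parameters.AddWithValue("@batchSize", batchSize);
                    command.Parameters.AddWithValue("@cutoff", new DateTime(2012, 1, 1));

                    var deleted = command.ExecuteNonQuery();
                    Console.WriteLine($"Deleted {deleted} rows.");

                    if (deleted < batchSize)
                        break; // last (partial) batch done – nothing left to delete
                }
            }
        }
    }
}

Between batches you could additionally issue a log backup (full recovery mode) or a CHECKPOINT (simple mode) to keep the transaction log small, as described above.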
Copy Data to New Table and Drop Old Table
One other way of deleting rows that I've used in combination with changing the table schema at the same time is the following:
- use a temporary table into which you copy all the rows you want to keep (the schema of which I modified to meet our new needs)
- delete the original table
- rename the temporary table to the original table's name
- recreate all indexes you had defined before
This is basically what SSMS generates for you when you change the schema of a table, except for the indexes – you have to recreate them yourself. As you can imagine, this approach becomes faster and creates a smaller transaction log footprint with a growing amount of data to delete. It won't have any benefit if you delete less than half of the table's rows.

Choose the right tool for the job
There are quite a few other approaches and tips out there on how to speed up your deletion process. It depends a lot on your concrete situation which of those will actually help you get your deletion job done faster. I had to experiment quite a bit to find the sweet spot, but now that I've seen a few approaches I'm able to make a better decision in the future.
by Anton
17. June 2013 15:51
After hearing a little about how to go about increasing the performance of sites built upon Orchard at the Orchard Harvest, I did a quick search in the documentation. Two interesting pages came up:
- Optimizing Performance, with a bunch of different tips
- Caching, with different types of cache
Regarding caching, the predominant opinion at the conference was that one should always turn it on. We will look into SysCache, Memcache and other modules like Cache. There was also talk about disabling dynamic compilation, but as of now we use this feature; maybe we can find a way around needing it. One project was demonstrated where Orchard is used as a CMS via Orchard's WebAPI, but a "normal" MVC application is used to display this content. The reason for this was to achieve faster page loads. Zoltan (as Piedone on CodePlex) has created a Combinator module to combine and minify stylesheets and javascript files. That could also reduce page load time. We will look into these options as soon as our portal sites pick up traffic and performance needs to be optimized.
by Oliver
17. June 2013 13:59
This is just a short post to draw your attention to a sweet tool I've just discovered: PNGGauntlet. It runs on Windows using the .NET 4 framework and is as easy to use as you could possibly wish. Also: it's completely free to use.

Convert Your Existing PNGs
For starters, we'll just convert some existing PNGs – can't really do any harm with that. In the Open File dialog, there's an option to filter for only .png files, and you can choose many of them at once. If you provide an Output directory, the optimized files will be written to that destination. But: the tool also has the option to overwrite the original files, which is awesome if you use some kind of source control (and thus have a backup) and just want to get the job done. During my first run, using the 8 processing threads my CPU has to offer, I got savings from 3% to 27%. PNGGauntlet also tells me that in total I saved 4.52 KB. If those were all the images on your web site, that would be a not so bad improvement, especially when you get it by investing about 2 minutes of your time and no extra expenses!

Real Savings
Running PNGGauntlet on the sprites that we use for Camping.Info, we were really surprised: out of 172 KB it saved us over 31%, a whole 54 KB! Now that's an improvement that on a slightly slower connection will already be noticeable. We'll definitely check the rest of our images for more savings.

Convert Other Image Formats
You can also choose to change your images' format to PNG if you're in the mood. I tried converting all GIFs in the Orchard CMS Admin theme to PNGs and went from a total of 24 KB for 20 files to less than 17 KB with no loss of quality – an over 30% saving! Just beware that you'll need to change the file references in your project to pick up the new PNGs.

Roundup
Easy, fast and cheap (as in free) image optimization doesn't have to be magic anymore – today anyone can do it. Check out PNGGauntlet to try it for yourself. There's really no excuse not to!