Spreadsheet applications are a tough program to beat. Little compares to the ease of use and the flexibility a spreadsheet provides. It is the place to start when performing an analysis, to prototype an idea or create an engine to convert data into useful information. Yet all too often I see spreadsheets designed in a way that hinder their usefulness. I am trying to be generic when I say spreadsheet application. The gold standard is Microsoft Excel for Windows—perhaps the one application Microsoft gets right. It is perhaps the one reason I run Windows virtually on my Mac. There are other applications out there. One that I am starting to adopt is the free OpenOffice Calc.
I believe there is an art and science to building spreadsheets. There is also a science in how spreadsheets should be deployed in the enterprise.
Below I will highlight the following:
Spreadsheet design:
Spreadsheet in the enterprise
The Pharos Art of the Spreadsheet
Where the art of the spreadsheet is most pronounced is in the creation of an engine. In power plant management an ideal application of a spreadsheet engine is a bid tool. The asset manager can enter fuel prices, market information, and unit capabilities into a spreadsheet. Then select a variety of bidding strategies to produce an output file that can be uploaded to an ISO. Philosophically, I am opposed to using a black box or bid generation application other than a spreadsheet since a spreadsheet provides the analyst with the tools to make modifications on the fly and truly understand the components of the bid. Managing power plants is a dynamic process. Bid strategies and unit operating characteristics can change and a tool needs to be just as flexible. Spreadsheets also provide an added level of transparency. One can pick up a spreadsheet and have a very good understanding of everything that is happening.
Here is my approach to designing a spreadsheet. I divide my workbooks into different types of sheets. I have a front end, manual input, automated input/data tables, global variables, transformation and calculation sheets and reporting and export. Each one has its purpose and makes it easy for a someone to follow along. Yes it can contribute to bloat but I believe having a well laid out spreadsheet is most important.
"Front End" tab. My first sheet is called “Front End.” It drives the spreadsheet and stores the most important global variables like the date for all calculations and as well as all buttons to run the macros and macro information.
Manual entry tabs. I dedicate a sheet or sheets to manual entry. Each cell that requires a manual entry would be highlighted a color. By consolidating all manual input to a few tabs you reduce the odds of an error of someone forgetting to input data. You also make it easier to do all manual entry in one place. It also makes it easier to capture and archive manual data. Say you want to upload all manual inputs to a database. Centralizing the manual inputs makes that an easy process.
Data tabs. These are tabs dedicated to external data feeds such as settlement information or a price download from the ISO. Each data feed has its own sheet. For example if you are importing day ahead prices where each day is a separate file, append each file to the day ahead prices tab. That tab should only be used for day ahead prices. Leave the first column empty for a key. The key is a unique identifier that will allow you to cross reference/lookup your data for your calculation. Typically the key would be a combination of date, hour, perhaps the generator ID or anything that makes looking up a unique row possible. The main advantage of having your data in one easy location here is you can more easily track issues by having all data of a certain type in one place. Plus you can more easily import and export that data from another flat file or database.
Transformation/calculations tabs. Raw data is often in a format not easily utilized in calculations. The date may go across but you need it going down. Transformation and calculation sheets format the raw data in a useable form and then perform the needed calculations. You will need to use lookup formulas to pull the data from your data tabs. Note it is the lookup formulas that are most computationally intensive. It can make what should be a one megabyte workbook become a twenty megabyte workbook or something that takes a second to recalculate take two minutes. I talk about strategies for dealing with that later.
The calculation tabs should be easy for someone to follow and understand how data is being manipulated. Break out complex calculations into components. If a calculation has five terms to it, each with several calculations behind them, break each component to a separate cell. It will be easier to follow and debug.
Fixed global variables tab. Have a sheet for those global variables that do not change. For example, asset IDs, lookup tables for the validation feature, key dates, etc.
Typically a user would never go to the calculation, data or fixed global variables tabs.
Data reporting/export tab. Have a separate sheet to format the data for export or produce the graphs for pdf or printing. These tabs would reference the calculation tabs and make the data more presentable or useful for the next application.
While setting up the workbook keep in mind that the spreadsheet becomes an ideal database. It’s a great strategy to think about how to format the workbook so someone can later mine that information and make it easier to extract. If the data and calculations do not occupy much space it may be worthwhile maintaining a rolling database. For example, there may be a daily revenue calculation that typically one would store in a workbook dedicated to a month. It may make more sense to continually append data to that workbook so someone can view a full year.
It may also make sense to add information that is not necessarily relevant but is related. If a settlements calculation requires the day ahead prices for a few generators, it may be worthwhile capturing all prices points used within an organization for both the day ahead and real time and storing that in a separate file. This way one source has been generated and other settlements calculators or analyses can feed from the same source.
Usability comments:
An important message about macros: Make your macros derive directions from data in the spreadsheet. The worst thing you can do is hard code something such as where to find a file or where to paste data unless it is information that will never change. A classic example is do not have a specific external webpage or worksheet referenced in the macro. Instead have it look up the external address from a cell within the spreadsheet. This way if something changes the user can change it and not have to rely on the programmer to change it. It also allows the end user to verify the location of the file or webpage to perform their own debugging. Creating a good spreadsheet pushes the maintenance from the programmer to the analyst.
An important message about external links: I think it is better to store the data internal to the application than link up to a spreadsheet. Create a macro to pull the data. Having all the data you need locally makes it easier to track errors. The one issue is making sure the data is up to date. Keeping data current can be mitigated by added error checking and version control. When you import data publish the date of the import on the front page. There may be tags in the data that tell you when the data was published which you can reference. For example, have a formula that looks at the max date published in the data sheet. If that date is less than the date of the report it should be a sign to reload the data. The other problem with external links is behavioral. I have observed in many instances users ignoring the message to update an external link or the link is broken and they do not attempt to fix it. They end up linking to stale information that is not being updated.
Storing data internally may add to spreadsheet bloat but the benefits outweigh the cost of disk and memory space.
Error checking is very important. It is always good to program data sanity checks. Add rule of thumb calculations to verify that the numbers you are calculating are in the right ballpark. Add checks to make sure your data is up to date. For each data feed have a summary describing what’s in it. Have the spreadsheet highlight problems using color coded conditional formatting.
This method I am proposing is very computationally intensive. It is geared for data that is not always the same. One day a data feed could be 50 rows another its 1000. You need to have formulas that can handle this. Lookup formulas that look up a specific key is the best way to handle it. (Side note VLOOKUP and HLOOKUP are easy to use but are not always most appropriate. Using a combination of MATCH and OFFSET functions achieves the same result with more flexibility).
One tip to reducing calculation time is first turn off recalcutation. But that does not reduce spreadsheet bloat. The best thing to do is delete the cells or copy-paste special-values with computationally intensive formulas. For example, say you have 744 rows of MATCH and OFFSETS. Delete all but the first or second row or better yet copy and paste special values. The formulas become a value. When you need to perform calculations just fill down the formulas from above, recalculate and then paste values again to “freeze” the values. It can make the difference of a 3 second recalculation or a 60 second recalculation or a ten megabyte file versus a forty megabyte file. I will typically code up macros to freeze and unfreeze calculations.
Another strategy is to create a “transformation” workbook specifically for formatting data the way you need it. Then export that data to another workbook to perform calculations. It may add a step but it will greatly increase the handling of your spreadsheet. If you have access to multiple computers you can run these processes in parallel.
Spreadsheet in the enterprise
Every so often I hear people say “we have to get out of all these spreadsheets” and move to a more “robust” or enterprise level application. There is a danger in that. It is recommended only for that which is static and does not change structurally. For bid tools and settlements applications excel is the best option. Bid strategies change. In settlements there are special charges or new rules that appear. Having an enterprise level application only reduces the flexibility to adapt to these changes. It increases the odds of failure and dependency (now IT staff is required to maintain and add functionality). It also requires a direct connection to servers. Having your bid tools on a laptop allows you to create and submit offers from anywhere you have an internet connection and not have to rely on the corporate connection working or responding quick enough. This plays well into disaster recovery and business continuity. I can put the relevant spreadsheets on my laptop and have all I need to manage the business. When I regain a connection to the server I can upload my work. When I used to manage power in New York City we had a small team managing the assets. We had to be flexible and light. With my laptop and a cellular card I did not need a connection to the corporate network. I was able purchase fuel and update our bids to the market all from a taxi in midtown Manhattan.
Spreadsheets are great tools for interacting with enterprise systems. If we think of the spreadsheet as an engine we can use the spreadsheet to create feeds that we then export to an enterprise database. For example, you may have a proprietary pricing engine that takes forward curves for power and gas and produces a hybrid curve for your models. The analyst can add the components, tweak the values, then by running a macro upload the values to a database.
If you would like more examples or assistance with your spreadsheets email us at info@pharos-ei.com.