Depth Reporting

Showing posts with label Spreadsheets. Show all posts
Showing posts with label Spreadsheets. Show all posts

Tuesday, May 6, 2008

Digital File, a database for investigative reporters and researchers

image

Haven't tried it myself, but Digital File costs about $45 and was recommended by a reporter on NICAR-L. It's for organizing investigations:

It is bases on Excel and runs on every PC and Mac. The database helps keeping track of an investigation. All steps are documented in a way that allows quick access and overview. The database contains contact info about sources, questions to ask, documents and (interview) notes, as well as a time tracker and expenses sheet. ‘After years of muddling in Word, this really is a solution!’

The creator is Luuk Sengers, a freelance investigative reporter and journalism lecturer in The Netherlands.

Monday, April 28, 2008

Pivot tables and macros in Zoho spreadsheets

A video explains the new features, which also include the ability to import up to 100,000 rows of data. You can write the macros in Visual Basic, share them online and import Excel spreadsheets with existing macros. You can't do everything you can with Excel, though, as both the macros and the pivot tables offered by Zoho are more limited.

 

 

[via TechCrunch]

Saturday, March 22, 2008

Microsoft Excel as a 3D game engine

This is either an advanced form of psychosis or genius: This guy demonstrates how to use Excel to make animated 3D graphics:

Integration of computer games and spreadsheets has tightened during the evolution of computer technology.

At an early stage this integration among the the games and spreadsheets was comical, e.g. they were installed on the same hard disk, or the purchased games were listed in an Excel sheet. Later the integration has tightened, as some games introduced a built-in spreadsheet (accessible by the "boss key" feature) - or Excel contained some built-in 3D games as Easter Eggs.

Now we have arrived at the next step of this integration, as Excel's cutting-edge 3D functionality is not hidden in Easter Eggs anymore but can be accessible publicly and easily. Excel has grown up and started its conquest as a revolutionary 3D game engine.

(When he talks about Easter Eggs, by the way, he's not talking about what the bunny will pass out Sunday. He's taking about secret features hidden in software.)

[via]

Tuesday, February 26, 2008

Add Data to Google Spreadsheets Using Forms

Not everyone is comfortable entering data in a spreadsheet, but anyone who has been online has filled out a form. Now you can gather data in Google Spreadsheets using forms. From the Google announcement earlier this month:

Google Spreadsheets has a new feature that lets you create a form to accept data. When you go to the Share tab, there's a new option to "invite people to fill out a form". The form is very simple and can be customized by changing the order of entries, their labels and the type of answers. It's also a nice way to get feedback people who wouldn't normally collaborate on a spreadsheet.

You can create forms from spreadsheets or using this URL. To keep track of your forms, add this gadget to iGoogle.

Monday, December 3, 2007

ExcelTips: Working with Elapsed Time

From the summary:

Since Excel stores times as numeric values, you can perform arithmetic operations on those values. This allows you to work with elapsed time, but doing so can present some results that may be confusing.

Tuesday, November 27, 2007

Chart Chooser

image

Juice Analytics' Chart Chooser helps you figure out what type of chart is appropriate for your data. You can then download a ready-made Excel or Powerpoint template to create it.

Friday, October 19, 2007

Free Excel Training Course at The Reporters' Cookbook

The Reporters' Cookbook is a still nascent wiki where reporters and fellow-travelers can "share code, examples, tutorials and other bits of information related to the practice of journalism, especially computer-assisted reporting." Yours truly is an administrator, although my most recent contribution was to write a script to battle an infusion of Chinese-language spam. Christopher Schnaars of The Morning Call has just contributed a free Excel training course, which can be found on The Cookbook's Excel page. There's a link there to download the course, which includes handouts and data sets for training others or yourself on how to use Excel. Schnaars is also writing a tutorial on using ASP.NET to put databases on the Web.

Thursday, September 6, 2007

Import and republish data using Google Spreadsheets

Google Spreadsheets now lets you import data from other sites, which you can then crunch and republish. You can import data from comma- and tab-delimited files, XML files, Web tables or lists and Google Reader.

To test it I copied the URL for the monthly civilian unemployment rate from FRED into a Google Spreadsheet cell, using one of Google's new import functions:

=ImportData("http://research.stlouisfed.org/fred2/series/UNRATE/downloaddata/UNRATE.csv")

Then I created a chart from the data, made the sheet public, and copied the code to embed the chart into another Web page. The chart is dynamic, so when the underlying data in Google Spreadsheets changes, the chart changes too. You can also choose to embed the spreadsheet itself, although in this case I chose not to. Each public spreadsheet also has its own feed, so someone can subscribe to your data and be notified automatically whenever it's updated.

All this is easy to do, assuming you know something about spreadsheets and data files. My biggest gripes were that editing the spreadsheet was balky on my modest home DSL connection, and I couldn't figure out how to make dates appear on the x axis of my chart (perhaps because there were too many). I also tried embedding the chart using the awful Google Page Creator, but it didn't work.

I don't know enough to know if this will be a viable way to serve data on the newspaper's Web site, but combined with the ability to manipulate spreadsheets with code, Google is become an ever more intriguing data platform.

Tuesday, September 4, 2007

Spreadsheet Addiction

The proprietor of Burns Statistics explains:

Some people will think that the "addiction" in the title is over the top, or at least used metaphorically. It is used literally, and is not an exaggeration. Addiction is the persistent use of a substance where that use is detrimental to the user. It is not the substance that is the problem -- more limited use may be beneficial. It is the extent and circumstances of the use that determine if the behavior is addictive or not.

Spreadsheets are a wonderful invention. They are an excellent tool for what they are good at. The problem is that they are often stretched far beyond their home territory. The overuse of spreadsheets is only too common.

Friday, July 6, 2007

How to check your Excel formulas

It's easy to make mistakes in Excel because a formula points to the wrong cells. Mr. Excel shares some tips for auditing them.

Thursday, June 14, 2007

The world began in 1900, Excel says

The Open Malaysia blog highlights an Excel flaw I've experienced myself: Its belief that dates began in 1900. He says this isn't a problem in OpenOffice:

Obviously if you have the money to spend, by all means, you are free to purchase Microsoft Office 2007. However please avoid the native file formats of those products if you are a Islamic historian, Renaissance archivist, Medieval coin collector, or someone who just has to work with dates prior to the 20th Century.

Microsoft offers a macro for calculating an age from a pre-1900 date. Excel User Tips also has an add-in to cope with the problem, but the site says it's inaccurate for dates before 1752 because of differences between American, British, Gregorian and Julian calendars. I don't have a clue if that's also an issue with OpenOffice or the macro.

Saturday, June 9, 2007

Software to fix errors in spreadsheets

An Oregon State University press release says computer scientists there have developed a new system for correcting errors in spreadsheets:

"Most users of spreadsheets are overconfident, they believe that the data is correct," said Martin Erwig, an associate professor of computer science in the OSU College of Engineering. "But it has been observed that up to 90 percent of the spreadsheets being used have non-trivial errors in them. In fact, one auditor has said he never inspected a single spreadsheet during his entire career that was completely accurate."

Sometimes the result is a paycheck delayed or a few dollars misplaced. But often the costs or financial misrepresentations are far more serious, and companies have lost millions or billions of dollars, Erwig said, occasionally drawing notice and ridicule in the national press.

The system is called GoalDebug, "which stands for 'Goal Directed Debugging of Spreadsheets.'' What it does "is try to identify the ways that humans commonly make mistakes and then suggest what the correct approach might have been. For instance, if someone sees a figure in a spreadsheet that seems suspicious or is clearly incorrect, they can plug in the correct number, and the OSU system can suggest several programming mistakes that might have created the error – which the user can then sort through and use to identify the problem," the press release says. OSU claims the system could help companies "save billions of dollars" -- a prediction, I'm sure, that has nothing to do with that fact that it's been "licensed to a spin-off company in Oregon."

Thursday, May 24, 2007

How to determine the winner of a multiple candidate field with Excel

I just struggled for ten minutes to show Andrew Wolfson how to compute an average in Excel, so I'm not optimistic many will make use of the following, but here goes.

Say there was just an election for governor and you have county-by-county vote totals for the six Democratic candidates. How do you determine the winner in each of the 120 counties, while also taking into account the possibility there could be ties?

Here's one solution, cadged from some online searches and scraps of old postings on NICAR-L. If anyone identifies a flaw in my work, please let me know. The formula looks like this:

=IF(COUNTIF(C2:H2,MAX(C2:H2))>1,"Tie",INDEX($C$1:$H$1,1,MATCH(MAX(C2:H2),C2:H2,0)))

The top row of the spreadsheet is arranged this way:

County Precincts Beshear Galbraith Henry Hensley Lunsford Richards Winner

Each subsequent row gives the vote totals for each candidate, in columns C to H. The formula, which you copy down each row in the winner column, looks in that row and if there is more than one largest number, as determined by the MAX() function, writes "Tie." Otherwise it takes the cell with the largest number, then, using a combination of the INDEX() and MATCH() functions, looks for the corresponding header of that column and writes that in the winner column.

I hope that's clear. If you have any questions about the details of how these functions work, consult Excel's help files or Wolfson.

Thursday, May 3, 2007

Calculating the difference between two dates in Excel

Cpearson.com calls Excel's mostly undocumented DATEDIF function "one of the drunk cousins of the Function Family."

"Excel knows he lives a happy and useful existence, and will acknowledge his existence when you ask, but will never mention him in 'polite' conversation," the site says.

The function calculates the number of years, months or days between two dates. It's especially useful for calculating someone's age given their date of birth, a common journalistic task.

The format of the function is:

=DATEDIF(startdate, enddate, period)

So

=DATEDIF("1/1/1960",NOW(),"y")

will return the number of years since January 1, 1960

=DATEDIF("1/1/1960",NOW(),"m")

will return the number of months. And

=DATEDIF("1/1/1960",NOW(),"d")

the number of days.

The function works in all versions of Excel back to version 5.0, and Microsoft mentions it on its site, but it isn't included in the help files except in Excel 2000. "The DATEDIF function has its origins in Lotus 1-2-3," The Spreadsheet Page says. "Apparently, Microsoft included it in Excel for compatibility purposes. Why is it not documented? Who knows? But it's likely that lawyers are involved."

I don't know if that's true, but it certainly has the ring of truth when you consider the official explanation of how to calculate the number of years between two dates using the functions that are mentioned in Excel's help files.

That formula looks like this:

=YEAR(A2)-YEAR(A1)-IF(OR(MONTH(A2)<MONTH(A1),AND(MONTH(A2)=MONTH(A1),DAY(A2)<DAY(A1))),1,0)

Now which would you prefer?

Friday, December 1, 2006

Google Spreadsheets mashups?

I've never used Google Spreadsheets to do anything serious, so I can't comment on how well it works, but improvements rolled out this week are intriguing. One is that Google has made the spreadsheets programmable, so you can write software code that adds, deletes or manipulates data. And it has added its own "GoogleLookup function," which lets you pull data from a Google query directly into the spreadsheet. For example, here's a demo pulling data on newspaper company employment using the formula, =GoogleLookup("newspaper_co_name","employment"). As you can see, the queries don't always work, for whatever reason, but you have to think that this will get better over time. You can also update the spreadsheets continuously as news data is added. All of this could spawn some intriguing data mashups.

Tuesday, November 28, 2006

Advanced Excel charting

ProcessTrends.com shows how to use advanced Excel charting techniques to visualize data.