Thursday, May 31, 2007

Free access to historical military records until June 6

Peter Smith of The Courier-Journal points out that Ancestry.com, a genealogy site whose records are also useful for investigative reporters, is offering free access to its vast collection of historical military records until June 6, the anniversary of D-Day:

I've been able to find such things as an image of my grandfather's World War I draft registration card (he never was drafted, but it mentions that his eyes were blue, which I never knew because I've only seen him in black-and-white photos); the history of a great-great-grandfather's Civil War regiment; the Civil War pension card of another ancestor; several references to books mentioning an ancestor who was a captain at Bunker Hill; and a World War II draft card of a great uncle who had already fought in World War I, then gone into vaudeville -- a vocation that isn't surprising considering that he wrote "yes" on the line where the card said, "Telephone."

Anyone who wants to research military records on anyone should check this out before D-Day.

Ancestry.com says its records include "every major U.S. war from the Revolutionary War through Vietnam, including draft registration cards, veterans' gravesites, soldier pension indexes, enlistment records, muster rolls and much more."

Crowdsourcing the immigration bill

I like this example of soliciting the wisdom of crowds. The Truth Laid Bear has taken a PDF of the federal immigration bill and converted it into a format that makes browsing online easy. The site lets readers link to and comment on specific parts of the bill. "My hope, however, is that by presenting the bill in this form, I will help make the bill more accessible to all, and provide a central spot where commentary, criticism, and suggested improvements can be assembled," the site says. "Who knows --- maybe our erstwhile leaders on Capitol Hill will take notice, and take some of our comments to heart."


Wednesday, May 30, 2007

State government reference sites

Peggy Garvin of LLRX.com offers "a guide to some of the most useful, free web reference sources covering the governments of the fifty states." "These resources are not specific to a single state, but rather provide nationwide coverage of state information," she writes. "They can help you find state personnel, news, legislation, laws, regulations, policy updates, and statistics. The following list highlights helpful information available from many websites, but it is by no means comprehensive."

Tuesday, May 29, 2007

Why Mathematical Models Just Don't Add Up

The authors of "Useless Arithmetic: Why Environmental Scientists Can't Predict the Future," explain why the complex math used to justify many government actions doesn't add up:

Mathematical models are wooden and inflexible compared with the beautifully complex and dynamic nature of the earth. In the 1960s and 1970s — with the arrival of powerful personal computers, governmental requirements for environmental-impact statements, and widespread applications of mathematical models — scientists thought that quantitative models would be the bridge to a better, more secure future in our relationship with the environment. But they have proved to be a bridge too far.

We now know that there are no precise answers to many of the important questions we must ask about the future of human interaction with our planet. We must use more-qualitative ways to answer them.

Predictive quantitative models should be relegated to the dustbin of failed ideas.

The Statistical Modeling, Causal Inference, and Social Science blog comments: "While the article is quite extreme in its derision of quantitative models, plugs the book the authors wrote, and employs easy rhetoric by providing only positive examples of a few failures and not negative examples of many successes, it is right that quantitative models are overrated in our society, especially in domains that involve complex systems. The myriad of unrealistic and often silly assumptions are hidden beneath layers of obtuse mathematics."

morgueFile

morgueFile is a "public image reference archive":

The term "morgue file" is popular in the newspaper business to describe the file that holds past issues flats. Although the term has been used by illustrators, comic book artist, designers and teachers as well. The purpose of this site is to provide free image reference material for use in all creative pursuits. This is the world wide web's morguefile.


Thursday, May 24, 2007

Send your fat attachments through Gmail

If your email provider or corporate IT department blocks large attachments (and in my experience, most do), you'll appreciate this: Chris Otts points out that Google has just doubled the maximum attachment size Gmail will accept to 20 megabytes. "Now you can start sharing more of those home videos, large presentations and files you just can't seem to get smaller," Google says. I think it's safe to say it's no coincidence that Google just bought YouTube and has announced it will be unveiling its own PowerPoint-like software soon. You can, by the way, find some other options for transferring large files at the Reporters' Cookbook.

How to determine the winner of a multiple candidate field with Excel



I just struggled for ten minutes to show Andrew Wolfson how to compute an average in Excel, so I'm not optimistic many will make use of the following, but here goes.


Say there was just an election for governor and you have county-by-county vote totals for the six Democratic candidates. How do you determine the winner in each of the 120 counties, while also taking into account the possibility there could be ties?


Here's one solution, cadged from some online searches and scraps of old postings on NICAR-L. If anyone identifies a flaw in my work, please let me know. The formula looks like this:


=IF(COUNTIF(C2:H2,MAX(C2:H2))>1,"Tie",INDEX($C$1:$H$1,1,MATCH(MAX(C2:H2),C2:H2,0)))



The top row of the spreadsheet is arranged this way:


County Precincts Beshear Galbraith Henry Hensley Lunsford Richards Winner


Each subsequent row gives the vote totals for each candidate, in columns C to H. The formula, which you copy down each row in the winner column, looks in that row and if there is more than one largest number, as determined by the MAX() function, writes "Tie." Otherwise it takes the cell with the largest number, then, using a combination of the INDEX() and MATCH() functions, looks for the corresponding header of that column and writes that in the winner column.

I hope that's clear. If you have any questions about the details of how these functions work, consult Excel's help files or Wolfson.
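For anyone who would rather script it than wrestle with nested functions, here is a rough sketch of the same logic in Python. It is only an illustration of the idea, not a replacement for the formula: the function name `county_winner` and the vote totals are made up for the example, and only the candidate names come from the spreadsheet header.

```python
def county_winner(candidates, votes):
    """Return the leading candidate's name, or "Tie" if the top total is shared.

    Mirrors the spreadsheet formula: MAX() finds the top total, COUNTIF()
    checks whether more than one candidate has it, and INDEX()/MATCH()
    looks up the matching column header.
    """
    top = max(votes)                      # like MAX(C2:H2)
    if votes.count(top) > 1:              # like COUNTIF(C2:H2, MAX(C2:H2)) > 1
        return "Tie"
    return candidates[votes.index(top)]   # like INDEX(header, 1, MATCH(top, row, 0))


# Header names come from the spreadsheet's top row; the totals are invented.
CANDIDATES = ["Beshear", "Galbraith", "Henry", "Hensley", "Lunsford", "Richards"]

print(county_winner(CANDIDATES, [1200, 300, 950, 40, 1100, 210]))  # Beshear
print(county_winner(CANDIDATES, [500, 10, 500, 5, 120, 80]))       # Tie
```

One thing to watch in this sketch: a county where every candidate has zero votes comes back as "Tie," so check for empty rows before trusting the result.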



Tuesday, May 22, 2007

Mining social networks for sources and stories

Way back when, if you wanted to find out where someone worked and where they went to school, who their friends were, what their hobbies were and what kind of music they liked to listen to, you had to knock on a lot of doors, make numerous phone calls, dig through city directories, stitch together documents or sit them down for a long interview. Now people volunteer all of that and more on social networking sites such as MySpace, LinkedIn, Facebook, Friendster and Bebo.

A social networking site is any site that attempts to make it easier for like-minded people to find each other. You typically share personal and professional details about yourself in online profiles, and link to and chat with others who share the same interests or the same circle of friends. There are countless such sites with millions of registered users.

If you think it's just teenagers, think again. There are at least three such sites for doctors, for example, and many more for patients. There are sites for political activists, music fans and people who like to bake. When I searched MySpace for Frankfort, KY, I found multiple pages for people who said they worked for state government. Even politicians now feel obligated to join.

Journalists are mining these sites for sources and stories. Virginia Tech students, for example, wrote about the massacre on LiveJournal and other sites, provoking a virtual feeding frenzy by reporters covering the story. Social networking sites have led to stories about a convicted sex offender from New York chatting up children online, Kentucky kids charged with burglary after sharing a video of a break-in and a Houston police officer who thought it humorous to share photos of dismembered women.

If you haven't already, someday soon you will want to find someone on one of these sites. Here are some ways to do that:

  • Use the site's own search engine. Some sites, like MySpace, use Google as their search provider. Others have their own, home-grown tool. Policies vary from site to site, but typically you have the option of making your profile private, so only people you invite can look at it, or public, so anyone at all is welcome. You may not be able to join some sites and view profiles at all, unless you misrepresent yourself, because they're only for certain classes of people, such as students. Read the site's terms of use and consult the appropriate ethics policy and your own moral compass before you act.


  • Use Google or Yahoo's site search. Typing site:myspace.com "Alex Davis" Courier-Journal in Google or Yahoo's search box, for example, will search MySpace for any mentions of Alex Davis and the Courier-Journal. If you try this search, by the way, you'll learn that Alex once trolled MySpace looking for people to talk about a coffeehouse.


  • Use a site devoted specifically to finding people on social networking sites. Here are some and how they describe themselves:


    • Better than white pages, Wink free people search lets you find people at social networks like MySpace, Bebo, LinkedIn & Friendster, and other online communities. Includes name search plus location, school, work, interests, and more.

    • yoName turns your computer into a private detective. Look for anyone you want. You can even look them up by a username or an email address! If they're on any of the big-time networks like MySpace or Facebook, yoName will find them. Look up friends, family, ex-es. Look up yourself and see if someone's impersonating you. Or just have fun and look up celebrities, even if the first five entries for Paris Hilton are all "male, 39, single, in Madison, Wisconsin".

    • ProfileLinker is an innovative web utility that allows you to link your social network profiles in one central location. You can also get message alerts from your favorite social networks, get updates on your friends, search for users across several networks, get your horoscope, weather, sports news and more.

    • Import your email address book and discover which of your friends are on social networks…

    • Explode is a social search tool that lets you find others online irrespective of which network they are on, as well as those running their own sites and blogs. It is a easy way to make connections, group these connections and interact with them either using your Explode profile or your own space somewhere else.

    • Discover, rate and share common interests with other communities around the world.
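If you run site-restricted searches like the one in the second tip often, you can assemble the query string with a few lines of Python. This is a minimal sketch: the helper name `site_query` is hypothetical, and it only builds the text you would type into the search box plus a URL-encoded copy. It does not contact any search engine.

```python
from urllib.parse import quote_plus


def site_query(site, phrase, *extra_terms):
    """Build a query restricted to one site, with an exact-match phrase."""
    parts = [f"site:{site}", f'"{phrase}"', *extra_terms]
    return " ".join(parts)


query = site_query("myspace.com", "Alex Davis", "Courier-Journal")
print(query)              # site:myspace.com "Alex Davis" Courier-Journal
print(quote_plus(query))  # URL-encoded form of the same query
```

Note that there is no space anywhere inside the site: operator; a stray one turns it into two ordinary search terms.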


Pipl, a people search engine mentioned here a few days ago, also searches social networking sites.

The usual cautions apply: You can't assume anything you find is true, and you'll have to find verification elsewhere.

Also keep in mind that your snooping may not go unnoticed: If you search for an email address on yoName, for example, the site sends a message to that person telling them that they've been searched, although it doesn't say by whom. There's also StalkerTrack, which helps MySpace users monitor people looking at their profiles. That doesn't mean they'll know your name or why you're looking at their page, but even when people write on Web pages accessible to anyone in the world, they persist in believing their words are somehow private.

Whether they remain so is up to you.

Saturday, May 19, 2007

This CJ story had a lot of holes

Some press criticism is bogus, but this comment by a reader of this morning's Web update, "Car runs into cemetery; woman hurt," was dead on:

typical shoddy reporting by the CJ.

Not one mention of the hundreds found dead at the scene.

Friday, May 18, 2007

An investigative technique so easy, even an 8-year-old can do it

Salon relays the story of how the 8-year-old son of a political scientist uncovered secret documents from the Coalition Provisional Authority, the defunct U.S.-led transitional government in Iraq:

My son made his discovery while impatiently waiting to play a computer game on my laptop. As part of a research project, I had downloaded 45 documents from a section of the CPA Web site known as Consolidated Weekly Reports. All but three of the documents were Microsoft Word. I had one of the Word documents up on my screen when my son starting toying with the computer mouse. Somehow, inadvertently, he managed to pull down the "View" menu at the top of the screen and select the "Mark up" option. If you are in a Word document where "Track changes" has been turned on, hitting "Mark up" will reveal all the deletions and insertions ever made in the document, complete with times, dates and (sometimes) the initials of the editors. When my son did it, all the deleted passages in a document with the innocuous name "Administrator's Weekly Economic Report" suddenly appeared in blue and purple. It was the electronic equivalent of seeing every draft of an author's paper manuscript and all the penciled changes made by the editors. I soon figured out that with a few keystrokes I could see the deleted passages in 20 of the 42 Word documents I'd downloaded. For an academic like myself it was a small treasure trove, and after I'd stopped hooting and hollering it took some time before I could convince my startled son that he hadn't done anything wrong.


Maybe the CPA should have read Microsoft's advice on how to prevent this sort of thing.
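You can also check for leftover tracked changes without opening Word at all, at least for files saved in the newer XML-based .docx format, where a tracked deletion is stored as a `w:del` element inside `word/document.xml`. The sketch below lists those deletions. It rests on some assumptions: the function name `tracked_deletions` is my own, it does nothing for the older binary .doc format (which the CPA files most likely used), and it ignores insertions, comments and other parts of the file.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used inside .docx files
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def tracked_deletions(docx_file):
    """Return (author, date, text) for each tracked deletion in a .docx file.

    docx_file may be a path or an open binary file object.
    """
    with zipfile.ZipFile(docx_file) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    found = []
    for deletion in root.iter(f"{{{W}}}del"):
        # Deleted text is kept in w:delText elements inside the w:del wrapper
        text = "".join(t.text or "" for t in deletion.iter(f"{{{W}}}delText"))
        if text:  # skip markers that carry no deleted text
            found.append((deletion.get(f"{{{W}}}author"),
                          deletion.get(f"{{{W}}}date"),
                          text))
    return found
```

Calling `tracked_deletions("report.docx")` on a file of your own (the file name here is just a placeholder) returns an empty list if no tracked deletions survive in the main document body.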

Wednesday, May 16, 2007

Pipl, a people search engine

Pipl claims to offer "The most comprehensive people search on the web." This is how the search engine answers when it asks itself the question, "What is so different about pipl?":

Pipl's query-engine helps you find deep web pages that cannot be found on regular search engines.

Unlike a typical search-engine, Pipl is designed to retrieve information in real-time from the deep web, our robots are set to interact with searchable databases and extract facts, contact details and other relevant information from personal profiles, member directories, scientific publications, court records and numerous other deep-web sources.

Pipl is not just about finding more results; we are using advanced language-analysis and ranking algorithms to bring you the most relevant bits of information about a person in a single, easy-to-read results page.

I searched my name and among the sites where Pipl found mentions of it, though not necessarily mentions of me, were Yahoo People, WhitePages.com, PeopleData, MySpace, Peoplefinders.com, TheScoop.org, the N&R News Research blog, ZoomInfo, LinkedIn, Blogger, and the SPSS mailing list. Other data sources the site suggests it searches include Hoovers, icq, the SEC, Amazon, Infospace, Reunion.com, Friendster, Flickr and LexisNexis.


Tuesday, May 15, 2007

Flickrvision

Twittervision is a hypnotic way to watch lives unfold via Twitter. Now there's Flickrvision, which does the same thing for Flickr. It maps and displays photographs as they are uploaded to the photo sharing site from around the world.

Monday, May 14, 2007

LawBeat

LawBeat is a blog hosted by the S.I. Newhouse School of Public Communications at Syracuse University:

LawBeat watches the journalists who watch the law. It is meant to start a conversation — here and in the classroom — about the quality of journalism focusing on the justice system, lawyers and the law

The primary author is Mark Obbie, the director of the Carnegie Legal Reporting Program at the school. If it interests you, you'll also want to check out Obbie's "10 Deadliest Sins of Legal Reporting."

Friday, May 11, 2007

You're a Nobody Unless Your Name Googles Well

So says The Wall Street Journal:

In the age of Google, being special increasingly requires standing out from the crowd online. Many people aspire for themselves -- or their offspring -- to command prominent placement in the top few links on search engines or social networking sites' member lookup functions. But, as more people flood the Web, that's becoming an especially tall order for those with common names. Type 'John Smith' into Google's search engine and it estimates it has 158 million results.

For people prone to vanity searching -- punching their own names into search engines -- absence from the first pages of search results can bring disappointment. On top of that, some of the 'un-Googleables' say being crowded out of search results actually carries a professional and financial price.

I'm guessing this isn't a problem for Rupert Murdoch.

Gmaps Pedometer

Reporters often need to describe the distance between two points. Courier-Journal reporter Chris Otts recommends the Gmaps Pedometer, which makes doing that easy. You just click to add points to the map and it tells you the distance between them. Intended for runners, it also lets you save and bookmark routes, compute calories burned and export to GPX, a format for exchanging GPS (Global Positioning System) data.
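If you already have latitude and longitude for your points, you can get a rough straight-line figure yourself with the haversine formula. I don't know how Gmaps Pedometer does its own math, so treat this as a separate sketch: the function names are mine, the Earth is assumed to be a sphere with a mean radius of 6,371 km, and the result is "as the crow flies" between points, not distance along roads.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean radius; treating the Earth as a sphere is an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def route_km(points):
    """Sum the leg distances along a list of (lat, lon) points, in order."""
    return sum(haversine_km(*a, *b) for a, b in zip(points, points[1:]))


# One degree of longitude along the equator, roughly 111 km
print(round(haversine_km(0, 0, 0, 1), 1))
```

Multiply by 0.621371 if you need miles.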

CubReporters.org

CubReporters.org, founded by Mark Grabowski, a third-year law student at Georgetown University who has reported for The Providence Journal, Arizona Republic and other newspapers, "is an online career guide aimed at young, college and early career journalists":

… even veteran journalists may find some of our features, such as the jobs page, useful. Launched in April 2007, the site gets more than 10,000 page views per week. CubReporters.org has been mentioned in MediaBistro, Online Journalism Review and several publications.

Unless you go to a journalism school like Columbia, Medill or Newhouse, your university may offer little assistance or know-how on journalism career development and job placement. I found myself in this position ten years ago, when I was in college and wanted to become a newspaper reporter.

Having been through the process, I developed this site to help other young journalists get their first byline, internship or job.

Thursday, May 10, 2007

Visuwords

Visuwords is an "online graphical dictionary." It visually represents a word, showing what type of word it is, words it's similar to, its derivation and more.

Tuesday, May 8, 2007

The mythical national criminal record check

Reporters, who should know better, too often think that they can do a thorough criminal background check by searching a few online databases. It ain't so. The Virtual Chase reprints an article by a law librarian that explains why: "National Criminal Background Checks: Myths, Realities & Resources."

Anyone who has been asked to conduct a "national criminal background check" knows the sinking feeling that comes from facing the requestor's confident assumption that such a request is reasonable, possible, inexpensive, and fast. When a Google search brings up dozens of hits containing words like comprehensive, instant results, free, and all 50 states, it is easy to see where that confident assumption comes from. Where to start to explain all the caveats, cautions, costs, and prohibitions?

Directory of nationally certified teachers

The National Board for Professional Teaching Standards offers a directory of 55,000 teachers who have earned its national certification. Search by state, city, district, year, name or certification area. It returns the district and city where they teach but not the school.

Friday, May 4, 2007

3D graphics with GE-Graph

The free GE-Graph plots data graphically on Google Earth. Here's an example: I took 2000 Census data on median home price by ZIP code for our circulation area, used Microsoft Access to combine it with a file that gave the latitude and longitude for each ZIP code, and cut and pasted it into GE-Graph. GE-Graph can also import data from a file or you can type it in, and it has various options for choosing colors and labels and for transforming the data. You then click "Run" and it exports the data as KML, Google's mapping format. This was the result:





The taller the bar and the darker the color, the higher the median home price. You can zoom in and out and rotate the graphic to view it from different angles in Google Earth. This is a kind of poor man's 3D Analyst Extension, which does this and much, much more, but costs $2,495.
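To give a feel for what is going on under the hood, here is a bare-bones Python sketch of the two steps described above: joining values to coordinates by ZIP code, then writing KML in which height stands for the value. This is not how GE-Graph itself works, and it draws a thin vertical line at each point instead of a colored bar. The function names, the `metres_per_unit` scale, and the sample figures and coordinates are all invented for illustration.

```python
from xml.sax.saxutils import escape


def placemark(name, lat, lon, value, metres_per_unit=0.01):
    """One KML placemark: a point raised above the ground in proportion to value."""
    height = value * metres_per_unit
    return (
        "<Placemark>"
        f"<name>{escape(name)}</name>"
        "<Point><extrude>1</extrude>"
        "<altitudeMode>relativeToGround</altitudeMode>"
        # KML lists longitude first, then latitude, then altitude in meters
        f"<coordinates>{lon},{lat},{height}</coordinates>"
        "</Point></Placemark>"
    )


def build_kml(values_by_zip, coords_by_zip):
    """Join the two tables on ZIP code and wrap the placemarks in a KML document."""
    marks = [
        placemark(zip_code, *coords_by_zip[zip_code], value)
        for zip_code, value in sorted(values_by_zip.items())
        if zip_code in coords_by_zip  # skip ZIP codes with no coordinates
    ]
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
        + "".join(marks)
        + "</Document></kml>"
    )


# Invented sample data: value by ZIP code, and (latitude, longitude) by ZIP code
prices = {"40202": 100000, "40299": 150000}
coords = {"40202": (38.25, -85.75), "40299": (38.15, -85.52)}
print(build_kml(prices, coords))
```

Save the output with a .kml extension and it should open in Google Earth.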


UPDATE: Oops. The perils of working too fast and not taking enough time to check your work: I mistakenly uploaded a 3D image of home prices instead of median income, so I've corrected this post to reflect that.

Thursday, May 3, 2007

Calculating the difference between two dates in Excel

Cpearson.com calls Excel's mostly undocumented DATEDIF function "one of the drunk cousins of the Function Family."

"Excel knows he lives a happy and useful existence, and will acknowledge his existence when you ask, but will never mention him in 'polite' conversation," the site says.

The function calculates the number of years, months or days between two dates. It's especially useful for calculating someone's age given their date of birth, a common journalistic task.

The format of the function is:

=DATEDIF(startdate, enddate, period)

So

=DATEDIF("1/1/1960",NOW(),"y")

will return the number of years since January 1, 1960.

=DATEDIF("1/1/1960",NOW(),"m")

will return the number of months. And

=DATEDIF("1/1/1960",NOW(),"d")

the number of days.
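If you do this sort of thing outside Excel, the same three calculations can be sketched in Python. This is my own approximation of what DATEDIF does for the "y", "m" and "d" periods, not Microsoft's implementation; the function name `datedif` is borrowed only for familiarity, and results may differ from Excel's in edge cases such as month-end dates.

```python
from datetime import date


def datedif(start, end, unit):
    """Count complete years ("y"), complete months ("m") or days ("d")."""
    if end < start:
        raise ValueError("end date is before start date")
    if unit == "d":
        return (end - start).days
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    if unit == "m":
        return months
    if unit == "y":
        return months // 12
    raise ValueError("unit must be 'y', 'm' or 'd'")


# Age on May 3, 2007, of someone born January 1, 1960
print(datedif(date(1960, 1, 1), date(2007, 5, 3), "y"))   # 47
# Someone born June 15, 1960, has not yet had a 2007 birthday
print(datedif(date(1960, 6, 15), date(2007, 5, 3), "y"))  # 46
```

To compute an age as of today, pass `date.today()` as the end date, much as the Excel examples use NOW().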

The function works in all versions of Excel back to version 5.0, and Microsoft mentions it on its site, but it isn't included in the help files except in Excel 2000. "The DATEDIF function has its origins in Lotus 1-2-3," The Spreadsheet Page says. "Apparently, Microsoft included it in Excel for compatibility purposes. Why is it not documented? Who knows? But it's likely that lawyers are involved."

I don't know if that's true, but it certainly has the ring of truth when you consider the official explanation of how to calculate the number of years between two dates using the functions that are mentioned in Excel's help files.

That formula looks like this:

=YEAR(A2)-YEAR(A1)-IF(OR(MONTH(A2)<MONTH(A1),AND(MONTH(A2)=MONTH(A1),DAY(A2)<DAY(A1))),1,0)

Now which would you prefer?

Make Web applications easily with Zoho Creator

Zoho Creator is a free tool for creating Web applications. You use it to create Web forms that allow people to submit data, process the data, then make it searchable in an online database. You can share your applications with the world, or keep them private and make them available to only a few. You can also embed these applications in your own Web site or blog. Typically, this is the sort of thing you'd do with a programming language like PHP and a database like MySQL. Zoho Creator, however, makes it ridiculously easy to do without specialized skills. Zoho Creator does have its own simplified programming language called Deluge ("Data Enriched Language for the Universal Grid Environment") that you can use to customize your creations, but it's not required. As a demonstration, I've created a simple application that allows people to submit interesting Web sites to Depth Reporting, emails the submissions to me and makes them viewable and searchable on Zoho. If you have something interesting to share, please submit it in this form:



Given that only about 120 to 130 people a day get Depth Reporting's feed, I'm not expecting much, but we'll see.

UPDATE: If you are reading this in an RSS reader, you won't be able to see the form, so you'll have to visit the blog page itself. The code doesn't work in feed readers, which I should have checked before posting.

Wednesday, May 2, 2007

My little piece of the Anna Nicole Smith saga

An editor called me yesterday and asked, Didn't you once write that it's possible to track an airplane in flight? Yes, in fact, I did. The editor wanted to know if I could see where Larry Birkhead and his daughter, Dannielynn, were after taking off from Nassau in the Bahamas. (Non-Louisville readers of this blog should know that Birkhead is from Louisville and met Anna Nicole Smith at a particularly posh and celebrity-studded Kentucky Derby party here known as the Barnstable Brown Gala.) We were too late, because Birkhead's plane, also carrying a crew from Access Hollywood, had already landed in Louisville. Ah well, fame is so fleeting ...

Tuesday, May 1, 2007

Tracking federal dollars on the Web

LLRX.com writes about the ins and outs of tracking federal dollars using Web databases:

Many of us in a position to be asked do not look forward to the inevitable questions that run along the lines of "did Organization X ever get any federal money?" or "how much do the feds contract out in industry Z?" The problem is that there will seldom be a simple, client-pleasing answer like "oh yes, $52,453,000.75 in fiscal year 2006, according to this single, comprehensive, and authoritative government database." We have to be familiar with a variety of sources, their fundamental strengths and weaknesses, and the ways in which we will have to qualify our answers.