Tuesday, December 27, 2005

Finding stolen public records on the Web

Federal Computer Week reports that historians and the National Archives and Records Administration are looking for missing and stolen government records at auction house Web sites. It says they have found about 150 "questionable items" since starting three months ago.

Social network analysis and the Wikipedia

The free Wikipedia is not only an encyclopedia -- it's also a massive dataset. Three researchers, including two from Indiana University, have analyzed and mapped "the semantic structure of the English Wikipedia as well as the activity of its major authors." This included using Pajek, social network analysis software some journalists have experimented with as a news reporting tool. In their introduction, the authors of the paper on Wikipedia say:

"Prior research has shown that particular areas of science are not driven by single authors but by effectively collaborating co-authorship teams – a global brain seems to be emerging on this planet. This has been interpreted as good news as human brains are assumed to not scale to process, understand and manage the amounts of information and knowledge available today. However, teams might be able to dynamically respond to the increasing demands on information processing and knowledge management."

Federal Wiretap Reports

The Administrative Office of the U.S. Courts offers wiretap reports summarizing by state and county "applications for orders authorizing or approving the interception of wire, oral, or electronic communications." Apparently there's not much interest in what Kentuckians and Hoosiers are saying, however, because a quick look for Kentucky and Indiana found nothing.

IRS Tax Statistics

Tax Stats by the IRS offers "a wide range of tables, articles, and data that describe and measure elements of the U.S. tax system."

Sunday, December 25, 2005

Good and Bad Procrastination

Programmer and writer Paul Graham says: "The most impressive people I know are all terrible procrastinators. So could it be that procrastination isn't always bad?"

Thursday, December 22, 2005

State and Local Government on the Net

State and Local Government on the Net is "A directory of official state, county, and city government websites."

PI News Link on researching private companies

PI News Link suggests using the Small Business Administration's "Small Business Source System Database" to research private companies. This database of firms certified for various SBA programs gives you contact information, an email address, the company's Web page, year established, number of employees, gross revenues, whether it "Accepts a Government Credit Card," and its DUNS number, the unique identifier Dun and Bradstreet assigns to businesses. I found information on 80 companies in The Courier-Journal's ZIP code, but beware: the SBA warns that this is a "self-certifying database" and that it can't guarantee its accuracy. PI News Link also offers a useful tip on how to use Google to find out whether a private business has been involved in litigation with the EEOC.

Wednesday, December 21, 2005

The United States of Ajax

This slick map shows what can be done with Yahoo's map programming tools.

Tuesday, December 20, 2005

Windows Live Local one-ups Google

Rich Schiefer suggested I take a look at Windows Live Local. This is Microsoft's new map site - previously it was called Virtual Earth - and it's designed to compete with Google's map service, called Google Local. Like Google Local, Windows Live Local lets you click and drag the map around with your mouse, get driving directions, and see map and satellite views. Microsoft's satellite images, however, are of a much higher resolution so you can zoom in much closer, at least for the areas in Kentucky and Indiana I looked at. Windows Live Local also lets you easily add your own custom pushpins to highlight a location, including the ability to add your own notes, then point someone else to the URL so they can see how you marked up the map. Here's an example using the satellite image of the CJ building. For some cities, but not Louisville, you can also get "birds-eye views," which "provide a high-resolution, low-angle aerial view of a small area." You can also customize how you print out driving directions, including adding your own notes, such as a description of what your house looks like, so a visitor can find it more easily.

Monday, December 19, 2005

MedlinePlus: School Health

MedlinePlus, by the U.S. National Library of Medicine and the National Institutes of Health, has a page devoted to school health. It includes pointers to news stories, overviews of school health issues, nutrition information, research, statistics and more. From personal experience I can attest to the truth of its top news story today: "Heavy, Poorly Positioned Backpacks Hard on Kids." But I won't burden you with the details of why my 7-year-old son insists on carrying an oversized wooden nutcracker to school ...

Milblogging.com: Index of military blogs

Milblogging.com bills itself as "The World's Largest Index of Military Blogs." The site says it was founded by a veteran of Operation Enduring Freedom who had blogged from Afghanistan and who wanted to make it possible to find military blogs "in fewer than 8,975 clicks." "Now, you can find milblogs by gender, and branch, and country, and much much more in less than five clicks," the site says. Thanks to Jim Malone for telling me about it.

Friday, December 16, 2005

Google adds new music search features

Google has added new music search features. "Now you can search for a popular artist name, like the Beatles or the Pixies, and often Google will show some information about that artist, like cover art, reviews, and links to stores where you can download the track or buy a CD via a link at the top of your web search results page," its blog says.

PhD blogs

PhDweblogs.net says it "is a non-profit initiative to bring together PhD students' weblogs from all around the world." Anyone preparing for a PhD -- and people with other research-related blogs -- can register at the Portugal-based site. It currently has 357 registered sites, which can be browsed by research field, country or language.

GoshMe, "The Web Search Assistant"

GoshMe calls itself a "Web Search Assistant": "Once the user sends us his/her query, we will check all Search Engines possibilities for him/her, and present it in the most comprehensive way, providing a list of all Search Engines and Databases appropriate to his/her query, ranked by relevance, divided by categories and sub-categories, and with a brief description about each Search Engine."

Thursday, December 15, 2005

Nature investigation shows Wikipedia "close to" Encyclopaedia Britannica in accuracy

Nature magazine compared the Wikipedia to the Encyclopaedia Britannica and concluded that "Wikipedia comes close to Britannica in terms of the accuracy of its science entries." Nature said its investigation "revealed numerous errors in both encyclopaedias, but among 42 entries tested, the difference in accuracy was not particularly great: the average science entry in Wikipedia contained around four inaccuracies; Britannica, about three." A Slashdot discussion of the article is here.

Beverly Bartlett: "Smart Books For Smart Women"

Beverly Bartlett, a former CJ writer and friend, now has her own Web site promoting her upcoming books, Princess Izzy and the E Street Shuffle and Cover Girl Confidential. Make sure you read the answer to the question, "Who is Beverly Bartlett?" And on her Frequently Asked Questions page she answers the vital question: "How come I can't find Bisbania on a map?"

Topix.net redesigns, adds blogs, community news

Yet another sign that a new media landscape is emerging: Topix.net, a news aggregator whose owners include Gannett, has begun including blog posts in its news feeds. Topix.net is what we use to feed Kentucky and Indiana headlines to our intranet homepage. It's odd to see posts from Kentucky blogs suddenly appearing among the headlines generated by the AP, the CJ, TV stations and other mainstream news outlets. Topix.net, which says it wants to provide "an intuitive, easy way to find the targeted news that is relevant to you," this week also unveiled a redesign that makes it easier to customize the home page the way you want. It is also experimenting with "community news" - including allowing users to contribute articles to the site, and adding local forums for every city and town in the U.S. and Canada. A press release quotes Rich Skrenta, the CEO:

"Our new capability incorporates the growing expectation of the online community to play an active role in the process of collecting, reporting, analyzing and disseminating news and information - in short, interacting with, instead of just reading, the news. "

Wednesday, December 14, 2005

Yahoo! Site Explorer

Yahoo! Site Explorer "allows you to explore all the web pages indexed by Yahoo! Search. View the most popular pages from any site, dive into a comprehensive site map, and find pages that link to that site or any page."

Tuesday, December 13, 2005

Lifehack.org on "Simple Steps on Handling Tasks"

Read about them here.

Talk Digger blog search

At Talk Digger you enter the URL of a Web page and it combines the results from nine blog search engines to tell you who is talking about that site.

Text-to-HTML conversion tool for Web writers

"Markdown is a text-to-HTML conversion tool for web writers. Markdown allows you to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML)."

Podscope audio and video search feeds

Podscope, an audio and video search engine, now offers RSS feeds that will notify you whenever something you want to monitor is mentioned in a podcast. When you do a search the results include buttons that let you monitor future mentions of your search terms using popular feed readers such as Newsgator, My Yahoo! and the Google Reader. Don't expect flawless results, though. "Just as you have been asked, 'Pardon me?' or 'Could you repeat that?', we too misunderstand some of what we hear," the site's blog says. "We’re working on it though, and we promise it will get better every month." I entered Mitch McConnell's name and found three podcast mentions of him - the newest from three months ago when he was mentioned on The Al Franken Show. The site lets you play the podcast snippet, and it begins with Al Franken's voice:

"Kentucky Senator Mitch McConnell threw a brick through the window of a Circuit City and blank a 60-inch plasma TV."

"Oh, that's so easy," a woman's voice says. "Found."

"Exactly, exactly!"

Bias note: I tried to find equivalent mentions of prominent Kentucky Democratic politicians, but came up with nothing, perhaps because there are so few of them.

Monday, December 12, 2005

Digital clues unmask Wikipedia faker

This morning we carried the wire version of a New York Times story about how the operator of an anti-Wikipedia Web site unmasked the author of the false Wikipedia post on John Seigenthaler Sr. Seigenthaler had been told the only way to identify the anonymous writer was to file a lawsuit against the writer's Internet service provider, which he decided not to do, writing about his experience in USA Today instead. Daniel Brandt, the operator of Wikipedia Watch and a book indexer from San Antonio, Texas, used a little basic detective work to identify the author as Brian Chase, the 38-year-old manager of a Nashville delivery service, who did it as "a joke." Brandt's main clue was the Internet Protocol address of the anonymous writer's computer, which Seigenthaler had included in his column. Every computer connected to the Internet has a unique IP address, and sometimes, even if that's all you know, it can be used to find which city someone lives in. For example, GeoBytes offers just such a locator service. (Brandt doesn't say what he used to trace Chase to Nashville, although he did use a free software tool called Curl to learn that the IP address was for a computer server, which gave him the clue that told him which company Chase worked for.)
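For the curious, here is a small Python sketch of one common first step when all you have is an IP address: a reverse DNS lookup, which asks whether a host name is registered for that address. This is not Brandt's method, which the story doesn't spell out. The function names are made up for this illustration, and 192.0.2.1 is a reserved documentation address, not a real computer.

```python
import ipaddress
import socket
from typing import Optional


def describe_ip(address: str) -> dict:
    """Validate an IP address and build its reverse-DNS query name.

    Hypothetical helper for illustration; it needs no network access.
    Raises ValueError if the text is not a valid IP address.
    """
    ip = ipaddress.ip_address(address)
    return {
        "version": ip.version,                  # 4 or 6
        "reverse_dns_name": ip.reverse_pointer, # name a PTR lookup would query
    }


def lookup_hostname(address: str) -> Optional[str]:
    """Ask DNS for the host name registered for an address, if any.

    Needs a network connection. Returns None when no name is found.
    """
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(address)
        return hostname
    except OSError:
        # socket.herror and socket.gaierror are both subclasses of OSError.
        return None
```

For 192.0.2.1, describe_ip reports version 4 and the query name 1.2.0.192.in-addr.arpa. Whether lookup_hostname returns anything useful depends on whether the address's owner has published a name for it, and a host name only hints at an organization. It does not identify a person.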

Opinmind blogger search

Opinmind is a blog search engine that divides the search results into bloggers who have either a positive or a negative view of a subject. For example, if you type in the word "journalists," in the search results on the left you'll find bloggers saying positive things like, "we DO need journalists" and "I personally love the journalists that go out on the limb to report our folly" and on the right are bloggers saying negative things like "I hate journalists" and "journalists throw out stupid questions." This is measured by its trademark "sentimeter," which, if I read it right, claims that overall 58 percent of bloggers have a positive view of journalists and 42 percent a negative view. Its crude Manichaean view of the world doesn't appear to take into account that you can - as I do - entertain both positive and negative views of journalists at the same time.

Friday, December 9, 2005

Hurricane vehicle and watercraft fraud database

The National Insurance Crime Bureau offers a database where you can search for vehicles or watercraft affected by Hurricanes Katrina, Rita and Wilma. Don't close the deal on a used car or boat these days without looking here first.

Health care spending by state

You can get state-by-state estimates of health care spending from 1980 to 2000 at the Centers for Medicare & Medicaid Services Web site. This includes spending by hospitals, doctors, clinics and nursing homes, as well as spending on prescription and over-the-counter drugs, eyeglasses, hearing aids and other products. But like all data, it has limits, and the site warns: "These estimates should not be used to calculate estimates of spending per person in a state."

How the New York Times sees blogs

A top editor at the New York Times gives that newspaper's take on blogs, as reported in L.A. Observed:

"It’s worth spending a little time thinking about blogs, and about ourselves. Blogs make some newspaper people nuts; they’re partisan, the thinking goes, and unfair and mean-spirited and sloppy about facts. Newspapers make some bloggers nuts; they think we’re dull and slow and pompous and jealous guardians of unearned 'authority.'

It’s a pretty dopey argument. Indeed, some blogs are lousy. So are some newspapers. Some blogs reject journalism. Some practice it."

Wednesday, December 7, 2005

Permission-free photos, music and text search

Creative Commons Search "helps you find photos, music, text, and other works whose authors want you to re-use it for some uses -- without having to pay or ask permission."

Worldpress.org: News and opinion from outside the U.S.

Worldpress.org offers "News and Views From Around the World." They reprint articles published in the press outside the U.S. and say they are "committed to offering our readers a first-hand look at the issues and debates that occupy the world's newspapers and magazines."

10 Journalism Tips For Bloggers, Podcasters & Other E-Writers

Even the dinosaurs of the mainstream media could benefit from 10 Journalism Tips For Bloggers, Podcasters & Other E-Writers. "Blogs, podcasts & e-newsletters make it easy for anyone to be a journalist," says the writer, Spencer Critchley. "But just as the debut of desktop publishing led to some very ugly documents, these newer tools are spawning some very sloppy journalism, which does no good for the reputation of participatory media."

Tuesday, December 6, 2005

Washingtonpost.com Congress Votes Database

The Washington Post this week put online a Congress Votes Database that lets you browse congressional votes back to 1991. You can get RSS feeds that will notify you whenever a member makes a vote, and get a feed of the ten most recent votes in Congress. You can learn who the biggest vote missers are, which votes took place late at night, which votes had large or narrow margins, and which position on each bill the majority of Democrats and Republicans took, if available. There are also vote breakdowns by party, state, region, "baby-boomer status," gender, and, for laughs, "by astrological sign." The data was gathered from various Congressional sites and massaged into a more useable form by the Post. "This site is generally updated every day, although there is a delay between a vote in Congress and its appearance on the official Congress Web sites," the site says. The database, which is described as a work in progress, is the creation of Adrian Holovaty, an Internet developer who created the Chicago Crime Web site before becoming the Post's editor of editorial innovations, and Derek Willis, a research database editor at the Post and the proprietor of The Scoop. I don't know about you, but I'll be looking forward to them adding a corruption index...

Merriam-Webster's Open Dictionary

Merriam-Webster has debuted its "Open Dictionary," "where you can 1) submit and share entries that aren't already in our Online Dictionary, and 2) browse entries submitted by other members of the Merriam-Webster Online community." Recent entries include "kempt," "downroar," "phenormous" and "emoticoning." But I wouldn't count on John Seigenthaler Sr. making a contribution anytime soon ...

Monday, December 5, 2005

Food nutrient database

The National Nutrient Database for Standard Reference from the U.S. Department of Agriculture gives you detailed information about the makeup of foods. For example, say you want to know about a jar of "Babyfood, cereal, mixed, with applesauce and bananas." You will learn it is composed of dozens of nutrients, including 49 milligrams of phosphorus.

HousingTracker

HousingTracker offers housing data for U.S. cities. This includes median asking price, housing inventory, and recent trends.

Gallery of Computation

The creator of The Gallery of Computation writes computer programs to make graphic images.

"With an algorithmic goal in mind, I manipulate the work by finely crafting the semantics of each program. Specific results are pursued, although occasionally surprising discoveries are made.

I believe all code is dead unless executing within the computer. For this reason I distribute the source code of my programs in modifiable form to encourage life and spread love."

Like, groovy.

Friday, December 2, 2005

Flightview flight and airport trackers

Flightview gives you the status of any U.S. airplane flight. You can search for flights by flight number, airport, airline or time of flight. It also offers summary information and charts showing the on-time status of all flights at each airport. They also sell historical data back to 2001. "Specify which airports or airfields you want and over what time period and we can tell you everything you need to know about the flights that landed or departed," the site says. "Identify aircraft type, tail or flight number, wheels-up and wheels-down times, time in-air, weight class, distance and many other datapoints." I learned about this site from PI News Link, a blog that offers "articles of note for the investigator."

Thursday, December 1, 2005

AFL-CIO Job Tracker database

The AFL-CIO has an online database called Job Tracker that tells you "which companies in your area are exporting jobs, endangering workers' health or involved in cases of violations of workers' rights under the National Labor Relations Act. The database contains information on more than 60,000 companies nationwide."

USATODAY.com: A false Wikipedia 'biography'

Retired journalist John Seigenthaler Sr. wrote in USA Today this week about how the Wikipedia falsely implicated him in the assassinations of John F. and Bobby Kennedy. It is a cautionary tale for anyone who takes the Wikipedia's words at face value. "... we live in a universe of new media with phenomenal opportunities for worldwide communications and research — but populated by volunteer vandals with poison-pen intellects," Seigenthaler writes.