Tuesday, January 31, 2006

Test answers question, "What kind of dog are you?"

This Web site, which promotes the movie Gone to the Dogs, offers a free test to find out what kind of dog you are:

There's a dog inside all of us, waiting to be let out. This game is based on a computer called SUKA, built in 1975 by Russian scientist Mikhail Volkonsky and now housed in a London museum.

SUKA is powered by CATS (Canine Algorithmic Transfer System™) which is able to determine what kind of dog you are. Simply answer 10 questions, being as honest and accurate as possible and CATS will calculate which breed you resemble the most.

For the record, I'm an Afghan Hound.

State of the Union Address database

Database maker askSam has found a good publicity generator by using its software to serve free information on the Web. Now they're offering a searchable database of State of the Union Addresses of the American Presidents going back to 1790.

Monday, January 30, 2006

New Biographical Directory of the United States

The 2,236-page Biographical Directory of the United States "provides the biographies of thousands of Members of the Senate and House who have served from the first through the 108th Congress," the GPO said in an email announcing the release of the 16th edition. "The most recent edition of the Biographical Directory details the changing face of Congress and lists Members by their full name, and for the first time nicknames or initials. Also listed are territorial delegates, resident commissioners, and vice presidents."

Women's Audio Visuals

Women's Audio Visuals In English, or WAVE, "is a database maintained by the University of Wisconsin System Women's Studies Librarian's Office that lists documentary, experimental, and feature film and video productions by and about women."

Friday, January 27, 2006

Encyclopedia of mythology, folklore and religion

Encyclopedia Mythica is an internet encyclopedia of mythology, folklore and religion.

Census International Data Base

The U.S. Census Bureau's International Data Base is "a computerized data bank containing statistical tables of demographic and socioeconomic data for 227 countries and areas of the world."

Nugget of the day: Federal prosecutions for public corruption

Average number of federal officials convicted per year of public corruption since 1985: 489

Source:

Free fonts

This page offers "A huge list of freeware fonts." There are so many they added a second page.

Thursday, January 26, 2006

Nugget of the day: Percent of workers who are union members

Kentucky

1983: 17.9
2004: 9.6

Indiana

1983: 24.9
2004: 11.4

United States

1983: 20.1
2004: 12.5

Source: Union Membership and Coverage Database
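
For anyone who wants to compare the declines directly, here is a minimal Python sketch that uses only the figures quoted above. The names membership and decline are my own, chosen for illustration.

```python
# Union membership as a percent of workers, as quoted in the nugget above.
membership = {
    "Kentucky": {1983: 17.9, 2004: 9.6},
    "Indiana": {1983: 24.9, 2004: 11.4},
    "United States": {1983: 20.1, 2004: 12.5},
}

def decline(rates):
    """Return (percentage-point drop, relative drop in percent) from 1983 to 2004."""
    start, end = rates[1983], rates[2004]
    points = start - end
    relative = points / start * 100
    return round(points, 1), round(relative, 1)

for place, rates in membership.items():
    points, relative = decline(rates)
    print(f"{place}: down {points} points ({relative}% of the 1983 level)")
```

By this calculation Indiana's rate fell the most, both in percentage points and relative to where it started.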

A Glossary of Hardboiled Slang

Twists, Slugs and Roscoes is "A Glossary of Hardboiled Slang." "This is the language spoken by Philip Marlowe, Sam Spade, Mike Hammer and the Continental Op," the site explains. "When Cagney, Bogart, Robinson and Raft got in a turf war, this is how they talked." Here's a quote from the home page, so you can get the full flavor of why you might need something like this: "You dumb mug, get your mitts off the marbles before I stuff that mud-pipe down your mush--and tell your moll to hand over the mazuma." Bonus points to anyone who can translate that without looking at the glossary first.

Wednesday, January 25, 2006

Email strengthens social ties, Pew report says

Email strengthens your social ties rather than weakening them, says a new report by the Pew Internet & American Life Project:
"Our evidence calls into question fears that social relationships — and community — are fading away in America. Instead of disappearing, people’s communities are transforming: The traditional human orientation to neighborhood- and village-based groups is moving towards communities that are oriented around geographically dispersed social networks. People communicate and maneuver in these networks rather than being bound up in one solidary community."

The Curbside Investigator podcast

The Curbside Investigator is a private eye who wants to explain himself in a podcast:
"The Curbside Investigator is my way of helping the public understand not only what Private Investigators can do, but what they can’t do as well. I’ve found that generally, investigators have an air of mystery about them tinged with a fair amount of suspicion. I’m going to try to dispel some of the myths about investigations, explain what we do, and hopefully introduce everyone to some true professionals. Along the way, I hope to educate everyone on when and how to use an investigator effectively should you ever need one."
I learned of this from PI News Link, which I've mentioned before. It often has investigative tips and commentary on public record issues of interest to reporters and other researchers.

The Missing Amendment

The Missing Amendment is a Web site whose mission is "to inform, to foster action, and to provide a collective voice" for privacy advocates.

Tuesday, January 24, 2006

New, improved FirstGov.gov search

SearchEngineWatch reports that FirstGov.gov, the federal government's Web portal, has a new, improved search page. Its improvements mirror the workings of the search engine Clusty, whose owner recently won the contract to redo the federal site's search.

Legal Authorities Supporting The Activities Of The National Security Agency Described By The President

The Government Printing Office has put online (PDF) President Bush's justification for letting the National Security Agency tap Americans' communications overseas.

Spamming Sundance winners

Employees of a company that sells anti-spam software and services have used the same statistical techniques used to filter junk email to predict the winners of the Sundance Film Festival. An LA Times story quoted the company's chief executive:
"We had the last 10 years of the festival's film guides, which are like inputs, and then a bunch of outputs, like how many people saw a film, did it win anything at Sundance, did it have commercial success. If you could figure out the pattern between the inputs and the outputs, then you could actually predict future winners."
The Web site for their project is here. I'd say they've learned something else from Hollywood: The value of a publicity stunt.

Nugget of the day: Car making

Today I begin an occasional feature in which I offer a nugget of information on a newsworthy -- and sometimes not-so-newsworthy -- subject. I see it as another way I can educate both you and me about the wealth of information sources that are out there.

I'll start today with a statistic about car making, since that's a subject much in the news today.

U.S. share of global passenger car manufacturing
1994: 19%
2004: 10%
Source: Bureau of Transportation Statistics, National Transportation Statistics 2005, http://www.bts.gov/publications/national_transportation_statistics/2005/index.html

How to sanitize Word documents

Helpful advice from your friendly National Security Agency - "Redacting with Confidence: How to Safely Publish Sanitized Reports Converted from Word to PDF":

Both the Microsoft Word document format (MS Word) and Adobe Portable Document (PDF) are complex, sophisticated computer data formats. They can contain many kinds of information such as text, graphics, tables, images, meta-data, and more all mixed together. The complexity makes them potential vehicles for exposing information unintentionally, especially when downgrading or sanitizing classified materials.

Monday, January 23, 2006

GovExec.com: The Perils of Political E-Mails

GovExec.com discusses The Perils of Political E-Mails. "There's a fine line between actions that fall within a civil servant's right to express an opinion on a political candidate or issue, and those that could be found to violate a law prohibiting political activity while on duty or in a government office," writes Daniel Pulliam.

Radical newspaper surgery

If you have a particular newspaper sacred cow, and even if you don't, you'll want to read this posting at the BuzzMachine. The author suggests lots of radical surgery for the dead-tree edition, including dumping local critics, stock listings, tv tables, comics, sports columnists, bridge columns, Washington bureaus and more. The comments on this are also worth a read.

Sunday, January 22, 2006

Free textbook revolution

Textbook Revolution is a Web site devoted to publicizing free educational materials. "In response to the textbook industry’s constant drive to maximize profits instead of educational value, I have started this collection of the existing free textbooks and educational tools available online," the site says.

Saturday, January 21, 2006

A universal standard for citing quantitative data

Two Harvard academics propose a "universal standard for citing quantitative data" in scholarly papers. They say such a standard doesn't exist now:

Practices vary from field to field, archive to archive, and often from article to article. The data cited may no longer exist, may not be available publicly, or may have never been held by anyone but the investigator. Data listed as available from the author are unlikely to be available for long and will not be available after the author retires or dies. Sometimes URLs are given, but they often do not persist. In recent years, a major archive renumbered all its acquisitions, rendering all citations to data it held invalid; identical data was distributed in different archives with different identifiers; data sets have been expanded or corrected and the old data, on which prior literature is based, was destroyed or renumbered and so is inaccessible; and modified versions of data are routinely distributed under the same name, without any standard for versioning. Copyeditors have no fixed rules, and often no rules whatsoever. Data are sometimes listed in the bibliography, sometimes in the text, sometimes not at all, and rarely with enough information to guarantee future access to the identical data set. Replicating published tables and figures even without having to rerun the original experiment, is often difficult or impossible.

Can Data Ever Be Deleted?

An article at Internetnews.com asks "Can Data Ever Be Deleted?" It says companies are "being overwhelmed by a tidal wave of compliance-related storage demands":

What data should you get rid of? The simple answer is anything that does not have any legal ramifications if you remove it. The sad thing is that nobody knows what that is. Any time you think about removing something, it's impossible to comprehend whether that data could become important in the future.

It can even be argued that e-mail about mundane matters such as the annual picnic should always be retained. If someone is injured at the event, for example, OSHA may be all over it and workers' comp might feel the company has to pay. If that person then dies, law enforcement will want to see that e-mail and lawyers will be subpoenaing every record under the sun, moon and stars.

Batch geocoding

Matt Waite points to an online tool that makes use of Yahoo's map hooks to batch geocode addresses. You just cut and paste a tab-delimited list of addresses into Batchgeocode's form and it returns the appropriate latitudes and longitudes, which you can then plot on a map. It even plots the first 100 on a map for you.
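
If your addresses live in a script rather than a spreadsheet, a minimal Python sketch like this can produce the tab-delimited text to paste into the form. The sample rows are made up, and the column names are my assumption, not necessarily what Batchgeocode expects.

```python
import csv
import io

# Hypothetical sample rows; the column names are an assumption for this
# sketch, not a documented Batchgeocode layout.
COLUMNS = ["address", "city", "state", "zip"]
addresses = [
    {"address": "100 Main St", "city": "Anytown", "state": "KY", "zip": "40000"},
    {"address": "200 Oak Ave", "city": "Otherville", "state": "IN", "zip": "47000"},
]

def to_tab_delimited(rows, columns):
    """Return the rows as tab-delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        writer.writerow([row[column] for column in columns])
    return buffer.getvalue()

print(to_tab_delimited(addresses, COLUMNS))
```

Using the csv module rather than joining strings by hand means any field that itself contains a tab or a quote gets quoted properly.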

Maybe you shouldn't be tested for cancer

The wiki Chance News writes about the issues raised in the book, "Should I Be Tested for Cancer? Maybe Not And Here's Why." The author of the book, Dartmouth medical doctor H. Gilbert Welch, contends "that screening for cancer is inefficient in that very few people who actually have the particular cancer are both discovered and then cured," according to the Chance article. "Moreover, the false positives result in many problems of which the general public is not aware."

Article on finding Web sites you can trust

Google's Newsletter for Librarians offers an article, "Beyond Algorithms: A Librarian's Guide to Finding Web Sites You Can Trust." The author, Karen G. Schneider, is the director of the Librarians' Internet Index, one of the best research resources on the net. "Whether we're selecting new web sites for our newsletter or deciding whether to toss or keep sites already in our collection, we rely primarily on what we call the 'big five show-stoppers': availability, credibility, authorship, external links and legality," she writes.

Friday, January 20, 2006

Visualizing baseball decisions with pie charts

It's not baseball season, and even in the height of the summer I hesitate to post something about this dying, corrupt sport, but I thought this was such an interesting data visualization I had to pass it along. The Baseball Visualization Tool "is built around the basic idea that a pie chart can represent a simple yes or no decision" - in this case, whether a pitcher should be pulled from the game. You input all the factors that go into that decision and the tool, in theory at least, helps you make a better choice. It's a demonstration by a company that makes data visualization products. Personally, I found inputting the factors and understanding what the tool was trying to tell me so daunting, I think I'll just go with my gut.

Washingtonpost.com shuts off blog comments

The Washington Post has temporarily shut off reader comments on one of its blogs. It follows the vitriol that spewed forth when their ombudsman, Deborah Howell, said lobbyist Jack Abramoff gave contributions to both Democrats and Republicans. She later acknowledged it would have been better if she had said Abramoff "directed" contributions to both parties, since he personally didn't contribute to the minority party. Washingtonpost.com's editor said a "significant number" of readers didn't follow rules banning "personal attacks, the use of profanity and hate speech."

Thursday, January 19, 2006

Global data on HIV/AIDS, TB, Malaria, and more

Globalhealthfacts.org by the Kaiser Family Foundation offers worldwide data on HIV/AIDS, TB, Malaria and other diseases, including not only the incidence of those diseases but also information on programs, funding, demographics and the economy of each country.

Glossary of commonly used legal terms

The federal courts offer their own online glossary of commonly used legal terms.

Web site monitor reviews

Here's a review of services and software that monitor Web pages and notify you when they've changed. I learned of it from librarian Gary Price at SearchEngineWatch, who offers his own take.

Tuesday, January 17, 2006

Deflating a memoir with the Mortality Detail File

Economist Steven D. Levitt, the author of Freakonomics, uses the CDC's "Mortality Detail File" to cast further doubt on the veracity of James Frey's memoir, A Million Little Pieces. Levitt compared the description of a suicide in the book with the data, which gives details on virtually every death in the U.S. from 1968 to 1997.

Detecting health care fraud with network analysis

Jong-Sung You of the Social Science Statistics Blog highlights an example of how network analysis was used to detect health care fraud. The example came from a 1990 book by Malcolm Sparrow, License to Steal: How Fraud Bleeds America's Health Care System. "Professor Sparrow suggests that many ideas and concepts from network analysis can be useful in developing fraud-detection tools, in particular for monitoring organized and collusive multiparty frauds and conspiracies," You writes.

FDIC's Bank Find

The FDIC's Bank Find can give you a wealth of information about banks and is searchable by name, address, city, state or ZIP code. Get an overall picture of the banking industry, financial information about particular banks, branch and deposit market share, create custom reports and much more.

Monday, January 16, 2006

Bestseller Lists 1950-1995

Cader Books has put online Publishers Weekly's list of bestselling hardcover books for each decade of the last century. There's something comforting in knowing that most of the bestsellers then are just as forgettable as most of the bestsellers now.

Chat with people viewing the same Web page

Chatsum "is a FREE add-on for your web browser that lets you chat with all the other Chatsum users that are looking at the same website as you." Works only with Firefox, and, soon, it says, Mac OS X Tiger. Haven't tried it myself, though.

Database of science and math songs

The MASSIVE database "contains information on over 2000 science and math songs. Some of these songs are suitable for 2nd graders; others might only appeal to tenured professors," the site says.

Pobb eHome

Pobb "is an online Tool to create your own unique eHome, with links to your favourite places on the Net and lots more." Everybody wants to be my home page, but I'm not lettin' 'em, including these guys.

Friday, January 13, 2006

Newsplorer

Newsplorer is a free, "fully skinnable news reader application." Essentially it collects your news feeds, groups them by category, and makes them available on your desktop in a "non-instrusive" way.

Museum of Yo-Yo History

I will resist the temptation to make a smart-ass remark about this because it speaks for itself: The Museum of Yo-Yo History. "You have found the most comprehensive archive of yo-yo images, historical memorabilia, and information in the world," the site says.

50 Best Firefox Extensions for Power Surfing

If you're a Firefox user, you'll want to check out the 50 Best Firefox Extensions for Power Surfing. The ability to enhance Firefox with various extensions is the reason I use it instead of Internet Explorer as my default browser. My personal favorites from the list: That's not to say the other extensions on the list aren't valuable; I just haven't tried them or don't have a use for them.

Searchable database of Alito's confirmation hearing testimony

askSam, the maker of a freeform database of the same name, has put searchable transcripts of Judge Samuel Alito's confirmation hearings online. There's also a searchable database of his published opinions.

Thursday, January 12, 2006

2005 Dubious Data Awards

The Statistical Assessment Service (STATS) at George Mason University last month gave out "dubious data awards" that it says honor 2005's "Biggest Science Reporting Flubs." Number one is "meth mania," and yes, we've written about meth too.

LLRX.com: Bibliography on “Sensitive But Unclassified” government information

LLRX.com offers "A Selected Bibliography on 'Sensitive But Unclassified' and Similarly Designated Information Held by the Federal Government." "Information professionals should be concerned about the number of categories of protected unclassified information, the vagueness of agencies’ definitions of the various protected categories, and the sheer volume of information that could potentially be deemed non-disclosable," says the bibliographer, Sara E. Kelley, a reference librarian at Georgetown University Law Library. She cites a 1994 report that "estimated that as much as 75% of all information held by the federal government might be considered sensitive but unclassified."

Share calendars with Mosuki

Mosuki is a Web site where you can share calendars and events. "Every event you create can be shared with a couple of friends, just your family, the whole world, or absolutely no one at all," the site says.

Penn State Center for Plasticulture

This is a sign that academic specialization has gone too far: Penn State has a Center for Plasticulture. Plasticulture is the use of plastic in agriculture.

What Coal Miners Do

The United Mine Workers of America explains What Coal Miners Do.

Tuesday, January 10, 2006

Godchecker.com

Godchecker.com offers an online encyclopedia with "over 2,850 deities." It can't be very good, though, because I'm not listed....

Make an online spreadsheet by cutting and pasting

The Jotspot Tracker lets you copy and paste from an Excel spreadsheet and instantly turn it into an interactive Web page. You can create two "trackers," as they're called, and share them with up to five users, for free. After that it's $9.95 a month. I tried it with some Census data, and it worked as advertised. Could be useful if you wanted to share a spreadsheet with reporters in the field …

Free scientific, technical and medical article search

You can now search Article Finder, which the owner touts as "the most powerful database of scientific, technical, medical, and other scholarly content," for free. You can "search through 26 million citations and 8.5 million abstracts from over 54,000 journals," the site says. But you must pay to get full-text copies of articles. Their press release mentions Google Scholar, which is a clue why they're doing this.

Games for the Brain

Games for the Brain has gotta be better for you than the steady diet of Neopets, Spiderman and Nick.com my kids consume. The site will challenge adults too.

2006 Statistical Abstract of the United States

The latest version of the Statistical Abstract of the United States is out and you can get it online for free. But why they put only hard-to-use PDFs online and don't make it easier to search on the Web is beyond me. Maybe they just want to push sales of the print version ... I always buy one, and it's usually the first place I turn when I need an obscure statistic.

Monday, January 9, 2006

Movie Keywords Analyzer

The Internet Movie Database now has something called the "Movie Keywords Analyzer", or MoKA. MoKA "lets you find titles that have a particular keyword and then presents a tally of all keywords from the titles that matched your initial keyword set," the site says. For example, say you're interested in movies about politics. Enter politics as your keyword and it will take you to this page. Then say you're specifically interested in films involving politics and murder. Those keywords will take you to this page. And maybe you're only interested in films involving politics, murder and corruption. Those will take you to this page. And so on, and on, and on ...
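
The narrowing-and-tallying idea is easy to picture in code. Here is a minimal Python sketch of it, assuming a made-up set of titles and keywords; it is my own illustration, not IMDb's data or implementation. One choice of mine: the tally leaves out the keywords you already selected.

```python
from collections import Counter

# Made-up titles and keywords for illustration only; not IMDb data.
titles = {
    "Film A": {"politics", "murder", "corruption"},
    "Film B": {"politics", "election"},
    "Film C": {"politics", "murder"},
    "Film D": {"romance", "murder"},
}

def analyze(titles, selected):
    """Return titles carrying every selected keyword, plus a tally of their other keywords."""
    selected = set(selected)
    matches = [name for name, keywords in titles.items() if selected <= keywords]
    tally = Counter()
    for name in matches:
        # Count only the keywords the user has not already picked.
        tally.update(titles[name] - selected)
    return matches, tally

matches, tally = analyze(titles, ["politics"])
print(matches)
print(tally.most_common())
```

Each keyword you add shrinks the list of matching titles, and the tally shows which keywords are left to narrow by.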

Compare lawyer and law firm credentials

LexisNexis has improved its lawyer locator by adding the ability to compare the credentials of lawyers and law firms. A press release says that you can now "compare up to four law firms or lawyers based upon specific decision support criteria, such as areas of practice, size of firm, office locations, educational background and attorney bar admissions."

Sunday, January 8, 2006

Should we have domestic violence offender registries?

The Quad-City Times reports that "A northern Illinois lawmaker wants to create a database for tracking people convicted of domestic violence." The story says the proposal for the database, which would be similar to a sex offender registry, came "from a constituent who unknowingly married a person with a history of domestic violence."

Sorting a text selection in Word

Word Tips explains how to sort selected text in Microsoft Word. It explains how you can sort words, numbers or dates in ascending or descending order.

61 million Social Security Death Index records for $6

A software company is selling the Social Security Death Index, with 61 million names, for $6 shipping and handling until February 28. The Social Security Death Index lists all Social Security recipients who have died and is helpful if you're trying to figure out if a news subject is alive or dead. It's also much loved by genealogists. This version, however, only includes people who died up to June 1999. There are also free versions on the Web, such as this one, which boasts more than 76 million records and says it was updated just last month.

SPSS Wiki

SPSS now has an unofficial Wiki. A Wiki is a Web page where anyone can edit the content. "SPSS Wiki is intended to be a reference and workbook for SPSS statistical procedures," the site says. "It is for both novice and expert." The site says the founder is a lecturer in the Department of Psychology at the University of Southern Queensland in Australia.

Historical Census population and housing data

The Census Bureau keeps selected historical Census population and housing data, including data that goes as far back as 1790, here.

Friday, January 6, 2006

The Aargh Page

The Aargh Page uses a spreadsheet-like grid and shaded cells to demonstrate graphically the many different ways people spell aargh on the Web. But doesn't the fact that someone would spend this much mental energy on the subject make you want to say,
"AAAAAAAAAAAAAAAAARRRRRRRRRRRRRRRGH"?
The creator explains himself here.

Computer security, crime and forensic podcasts

CyberSpeak is a "Technology Podcast with coverage of Computer Security, Computer Crime and Computer Forensics topics."

100 Oldest Classified Documents at NARA

The Black Vault, which calls itself the "largest online military and government research center," has posted online documents it says list the "100 oldest classified documents" at the National Archives and Records Administration.

Kentucky Fire Wire

The Kentucky Fire Wire says its goal is "to act as an avenue for communication throughout the Kentucky fire services. ... It's for Kentucky firefighters, by Kentucky firefighters." There's news, forums, pictures, classified ads, links to fire departments and more. There's also a database of Kentucky fire departments that includes addresses, phone and fax numbers, email addresses, membership counts, radio frequencies and more, although not all information is included for all departments. The site says it is "owned and operated with pride by a Jefferson County, KY firefighter/EMT who enjoys being a unique and important source of information for fellow firefighters."

Thursday, January 5, 2006

KentuckyVotes.org

KentuckyVotes.org says it "gives users instant access to concise, plain language and objective descriptions of every single bill, amendment, and vote that takes place in the Kentucky legislature":
"Unlike any other bill tracking utility, KentuckyVotes.org, is unique because all legislative actions are described - not just those selected by a particular interest group. It is searchable by legislator, keyword, and dozens of subject categories, so users can create their own custom 'voting record guide.'"
The blog Kentucky Progress wrote yesterday that it was "striking" to watch the KentuckyVotes.org staff "hold court in the Capitol cafeteria." "I saw legislators come by and explain their actions and ask questions about the functions of what is essentially the new sheriff in town," the blog said.

"Finding Subversives with Amazon Wishlists"

Applefritter.com offers a tutorial on "Finding Subversives with Amazon Wishlists":
"It used to be you had to get a warrant to monitor a person or a group of people. Today, it is increasingly easy to monitor ideas. And then track them back to people. Most of us don't have access to the databases, software, or computing power of the NSA, FBI, and other government agencies. But an individual with access to the internet can still develop a fairly sophisticated profile of hundreds of thousands of U.S. citizens using free and publicly available resources."
Using free software tools such as wget and basic programming skills, the writer shows how anyone can pinpoint exactly who's interested in reading "dangerous" texts such as "Slaughterhouse-Five," "1984" and anything by Michael Moore or Rush Limbaugh.

UJIKO.com: Evolutionary search

UJIKO is a slick-looking search engine that visually categorizes your results, on the theory it makes it easier to find what you're looking for. The site says UJIKO "evolves with your expertise: The more you use it, the more functions it is able to offer. ... UJIKO's interface is designed to instantaneously help you understand its basic and advanced features set. For the first time, you are using an evolving search Internet player. It is about an object in 3 dimensions which evolves/moves, changes, becoming more and more powerful as you gain more expertise."


Report: How Women and Men Use the Internet

A report (PDF) by the Pew Internet & American Life Project says "Men like the internet for the experiences it offers, while women like it for the human connections it promotes."

Wednesday, January 4, 2006

Tuesday, January 3, 2006

Yahoo! offers "Open Shortcuts"

At Yahoo! you can now create your own search shortcuts that will allow you to use another site's search function from Yahoo! You can borrow ready-made shortcuts or create your own. For example, if you want to search Wikipedia, you can type "!wiki computer-assisted reporting" (without the quotes) in Yahoo!'s search box and it will take you directly to the computer-assisted reporting entry in Wikipedia. Or you can type "!amazon investigative reporter's handbook" to look up that book at Amazon.
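
Under the hood, a shortcut like this amounts to a lookup table plus some URL encoding. Here is a minimal Python sketch of the idea. The function name and the URL templates are my assumptions for illustration, not Yahoo!'s actual shortcut definitions.

```python
from urllib.parse import quote_plus

# Illustrative shortcut table. The URL templates are assumptions for this
# sketch, not Yahoo!'s actual shortcut definitions.
SHORTCUTS = {
    "wiki": "https://en.wikipedia.org/wiki/Special:Search?search={query}",
    "amazon": "https://www.amazon.com/s?k={query}",
}

def expand_shortcut(text):
    """Turn '!name search terms' into a target URL, or return None if no shortcut applies."""
    if not text.startswith("!"):
        return None
    # Split the shortcut name from the search terms at the first space.
    name, _, query = text[1:].partition(" ")
    template = SHORTCUTS.get(name.lower())
    if template is None or not query.strip():
        return None
    return template.format(query=quote_plus(query.strip()))

print(expand_shortcut("!wiki computer-assisted reporting"))
print(expand_shortcut("!amazon investigative reporter's handbook"))
```

Anything typed without a leading "!" comes back as None, which a real search box would treat as an ordinary search.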

PODZINGER podcast search engine

PODZINGER is another podcast search engine. If you get a match, you can listen to the entire podcast or the part of the broadcast where your search terms were mentioned, download the podcast, or subscribe to the podcast via Yahoo! or iTunes. You can also subscribe to an RSS feed telling you whenever there are new podcasts matching your search terms.

Free ways to share large files

Here's a list of free sites that let you share large files across the Internet. If you've ever sent an attachment too big for your email system, you'll understand the need for these.