Saturday, January 27, 2007

Many Eyes on data






Recently I posted a link to Swivel, a Web site that lets users share, analyze and comment on data. For whatever reason, I never gave Swivel a serious try. Then I read about Many Eyes, a similar site by visualization researchers at IBM that went public this month. One of the makers of Many Eyes helped create the NameVoyager, a fabulous tool for exploring how preferences for names have changed over time. I immediately felt the urge to play with Many Eyes.

Whereas Swivel's home page displays a series of uninspiring bar charts, Many Eyes offers a rich variety: not just bar and line charts, but also bubble charts, histograms, network diagrams, treemaps, U.S. and world maps and more. It's exceptionally easy to feed data to Many Eyes: You just cut and paste from an Excel spreadsheet into a Web form, and the site recognizes the format and immediately lets you analyze it visually. Swivel, meanwhile, takes you through the familiar, laborious process of choosing a file from your hard drive (comma-separated values only, please, no Excel!) and stepping through a series of screens. It's easier to create a chart with Many Eyes than it is with Excel itself or other desktop chart tools I've used.

Especially impressive is the way you can easily manipulate charts on the fly -- selecting categories of data, and rearranging how the data is displayed on the screen to understand it in new ways. You can grab anyone else's data and manipulate it, subscribe to comments on any charts and post charts to a blog. You can do that with Swivel, too, and Swivel attempts to automatically find correlations between different data sets, something Many Eyes doesn't attempt. Swivel has many good qualities, but at this stage I don't find it as compelling as Many Eyes. Using Many Eyes was -- dare I say it? -- fun.



P.S. You can read about how the Many Eyes creators compare themselves to Swivel on the O'Reilly Radar blog.



P.P.S. As the CEO of Swivel points out in the comments, it is possible to cut and paste data into Swivel, so shame on me for getting that wrong.

Friday, January 26, 2007

Guide to sources of statistics

Looking for statistics on a particular subject? Check out the appendix to the Statistical Abstract of the United States, which provides a summary of sources of statistics. The guide tells you how frequently the statistics are updated, whether they're available on paper or on the Internet and the Web address of the organizations that offer them.

National Traffic and Road Closure Information

National Traffic and Road Closure Information is a federal portal to traffic information. The Kentucky and Indiana pages, for example, link to TRIMARC, Louisville's traffic monitor. There are also links to other traffic-related sites.

Wednesday, January 24, 2007

Yahoo! school search

Yahoo! has added nationwide school search. From the Yahoo blog:

Buying a house is all about location, and proximity to good schools is one of the big factors in choosing where to live. Today we launched nationwide schools search on Yahoo! Real Estate. Users can search and browse local schools via an interactive map interface, and refine and sort their search by school district, distance, grade level, or school type (e.g. public, private, charter).

We partnered with leading non-profit, GreatSchools.net, for detailed school information, statistics, parent reviews, and links to test scores and related content.

Each school detail page also features a neighborhood map that leverages the Yahoo! Local API to plot nearby grocery stores, parks, restaurants and other local businesses with user ratings and reviews to help users get a better feel for the neighborhood.


Tuesday, January 23, 2007

MuniNetGuide

MuniNetGuide is "An online guide and directory to web sites for state, county and local governments and other municipal related topics."

Our unique emphasis on municipal bonds, state and local government, and public finance allows us to focus on web sites of interest to professionals with an interest in government administration, municipal investment, research, and urban affairs.



While the Internet has made it possible to access more news and information than even before, it has also made us vulnerable to information overload. That's where we enter. We find the best and newest municipal web sites and features, and point you in their direction. MuniNet Guide focuses on research, news, trends and perspectives to eliminate "random roaming" – one of today's biggest time wasters.

Monday, January 22, 2007

Digital History Hacks blog

I learned about How to Read a Book, mentioned previously, on Digital History Hacks, a blog about applying a geek's tools to history. The author is William J. Turkel, an assistant professor at the University of Western Ontario. If journalism truly is the first rough draft of history, then journalists have something to learn from historians. Posts I liked included "Teaching Young Historians to Search, Spider and Scrape" and "On N-gram Data and Automated Plagiarism Checking." And check out his recently posted "Readings for a Field in Digital History."

How to Read a Book

How to Read a Book (PDF) by Paul Edwards, an associate professor of information at the University of Michigan, is a good, short primer on extracting the maximum content from a book in the minimum amount of time.

EPA's Window to My Environment

The Environmental Protection Agency's Window to My Environment maps federal, state and local environmental features, including the "location of regulated facilities, monitoring sites, water bodies, population density, perspective topographic views," "geographic statistics about your area of interest, including estimated population, county/urban area designations, local watersheds/waterbodies," and "links to information from federal, state, and local partners on environmental issues like air and water quality, watershed health, Superfund sites, fish advisories, impaired waters, as well as local services working to protect the environment in your area." Search by ZIP code or town and state.

Friday, January 19, 2007

Civil Rights Litigation Clearinghouse

The Civil Rights Litigation Clearinghouse from Washington University Law School "is a collection of documents and information about civil rights cases in selected case categories across the United States. Currently, the categories include: Child Welfare, Election/Voting Rights, Immigration, Jail Conditions, Juvenile Institution, Mental Health Facility, Mental Retardation Facility, Nursing Home Conditions, Police Non-Profiling, Police Profiling, Prison Conditions, Public Housing, School Desegregation."

Thursday, January 18, 2007

BRB public records blog

BRB Publications, which publishes books on researching public records, now has a blog devoted to the subject.

Gallup Guru blog

USA Today's Gallup Guru is a blog on polling by Frank Newport, the editor in chief of the Gallup Poll and the author of Polling Matters.

Louisville media critics online

Clearly old media must still have some juice. Why else would these Web sites exist to set us straight?

  • The Ville Voice is the place where "Louisville's Media Critic Takes on the News."
  • The Louisville Media Reform Community "is a local, nonpartisan network of people working to open the print and broadcast media establishment to citizen participation in order to ensure that diverse voices are heard and the public interest is served."

Wednesday, January 17, 2007

New features on GovTrack

GovTrack.US, an excellent Congressional monitoring Web site created by a graduate linguistics student in his spare time, has added several new features, including "a recent voting history, an ideology meter, and pages for senators show their constituent approval rating". The one I find most intriguing is that it now displays the full text of bills highlighting "the changes made to the text during its legislative history, such as due to amendments and substitutions." Theoretically this should make it easier to spot those subtle changes in bills that can radically alter their meaning and purpose.
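For the curious, here is a rough sketch of how change highlighting between two versions of a text can work. It uses Python's standard difflib module and is not GovTrack's actual method, which the site does not describe here. The function name and the two snippets of "bill text" are invented for illustration.

```python
import difflib


def changed_lines(old_text: str, new_text: str) -> list:
    """Return only the removed ('- ') and added ('+ ') lines between two versions."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); skip those.
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Hypothetical bill text, made up for this example.
introduced = "The agency shall publish the report.\nFunding is limited to $1 million."
amended = "The agency may publish the report.\nFunding is limited to $1 million."

for line in changed_lines(introduced, amended):
    print(line)
```

A one-word swap such as "shall" to "may" is exactly the kind of subtle change that is easy to miss when reading two long versions side by side.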

Monday, January 15, 2007

The Hype Machine: Music blog and concert search

"The Hype Machine keeps track of songs and discussion posted on the best music blogs," the site boasts. "Easily listen, discover and buy songs that everyone is talking about!" My first thought reading this: Is this hype?

Digital document sharing, sort of

Footnote is a Web site for sharing digitized documents.

At Footnote.com you will find millions of images of original source documents, many of which have never been available online before.



But at Footnote, finding an image is just the beginning.



We have created powerful tools that let you interact with and enhance what you find. Annotate important information on the image, easily organize and share your findings or collaborate with people who have similar interests.



It recently signed an agreement with the National Archives to host 4.5 million documents, including the Papers of the Continental Congress, Constitutional Convention Records, photographs by Mathew Brady, records of the Southern Claims Commission, the Civil War Pension Index and 1908-1922 investigative files of the Bureau of Investigation, the precursor to the FBI.



Addendum: A reader points out that to actually view the documents and photographs, you have to pay $99 a year or $9.99 a month for a subscription, or $1.99 to view a single one. That speaks poorly of both the site, for not making this explicit upfront, and the National Archives, for selling access to our heritage, which should be free to all, always. The documents will be available to all taxpayers only later, according to a press release announcing the archives agreement:


By February 6, the digitized materials will also be available at no charge in National Archives research rooms in Washington D.C. and regional facilities across the country. After an interval of five years, all images digitized through this agreement will be available at no charge through the National Archives website.


Why five years, given that many federal government documents and photographs are already online for free? Clearly I should have read the footnotes.

Friday, January 12, 2007

cRANKy: Search for the 50-plus crowd

cRANKy says it is "The first age-relevant search engine." The company that created it, Eons, explains:

Are you overwhelmed by too many results when you try to search the Web to find what you’re really looking for? Eons has developed cRANKy, a specialized, age-relevant Web search engine for 50+, which starts you off with the four best Web results we can find, rated and ranked by both the cRANKy editorial team and our members. The more our members participate in rating and reviewing, the better cRANKy gets. If you’re a registered member, cRANKy will save your search history so you can easily navigate back to the Web sites you’ve visited through cRANKy. We will prompt you on your personalized homepage as well as within cRANKy each time you visit a site to remind you to let us know how you like it. Web sites reviewed by the cRANKy editorial team include rich descriptions of the site, tips, and deep links to help you navigate directly to key information. Results marked with the Eons symbol help you locate content on the Eons site related to your search.

With a name like cRANKy, clearly they don't think the more senior among us need to be flattered.

Breaking news on food & beverages

FoodNavigator-usa.com offers "Breaking News on Food & Beverage Development." Bet you didn't know this: "Easterly emerging markets have continued to stop the rot for regular carbonated soft drinks as western consumers increasingly turn to diet versions and water, a new report says."

Frequently stolen cars

The National Insurance Crime Bureau offers state-by-state listings of the most frequently stolen cars. It doesn't bode well for Kentucky that its most frequently stolen vehicle is the 1986 Oldsmobile Cutlass.

Thursday, January 11, 2007

Specially Designated Nationals and Blocked Persons

The Office of Foreign Assets Control maintains a list of "Specially Designated Nationals and Blocked Persons." "The Office of Foreign Assets Control (or OFAC) is an office of the United States Department of the Treasury that administers and enforces economic and trade sanctions based on U.S. foreign policy and national security goals against targeted foreign countries, terrorists, international narcotics traffickers, and those engaged in activities related to the unapproved proliferation of weapons of mass destruction," Wikipedia explains. You can get the information in multiple formats, including formats suitable for importing into a database. You can also search using the federal government's Excluded Parties List System, and there are free search tools on the Web, including one offered by InstantOFAC and another by NASD. Wikipedia notes that blocked parties include Al Qaeda, communist parties of several countries and The Central Bank of Iran:

Also included are various names of individuals whose names are blocked, though controversially this has caused some innocent individuals to be unable to perform daily transactions such as wiring money to relatives or opening bank accounts, even though OFAC has a verification hotline to verify that a blocked individual is, in fact, a positive match to the list or not.

A Periodic Table of Visualization Methods

A Periodic Table of Visualization Methods is a unique way to show the many different ways you can visualize ideas and data. You hover your mouse over an "element" and it gives you an example of that particular visualization method. The Web site itself offers a course, not open to all, to develop visual literacy skills. Andrew C. Thomas of The Social Science Statistics blog says the periodic table visualization is "not at all useful, but the depth and organization sure is."

Searchbots

Searchbots lets you build your own "search robot."

A Searchbot is your own personal search robot that continuously searches the Internet trying to find all the best websites it can on your behalf. When you build a Searchbot you give it a personality and then program it's search circuits with all the things you want to find. You can search for websites based on factual information like tags and locations, or more creative ways like colour and the mood you're in. You can even ask your Searchbot a question and it will talk to other Searchbots to find you an answer...



Searchbots.net is an experimental search engine that investigates the use of mythology, personification and game theory as motivational strategies in creating a sustainable search community. Searchbots has a rich history and is unique in that it allows you to search using more "human" and entertaining types of information like colour and mood. If you picked the colour red you might get a website about tomatoes, communism or angry people.



This was too cute for me. You have to go through the tedious process of assembling your robot and giving it a personality before you can send it out in the world. Just give me a search box, please.

Wednesday, January 10, 2007

"Fuzzy matching" ZIP code finder and address verifier

QAS offers a free address lookup tool that does "fuzzy matching" to verify addresses, including returning the full ZIP code, filling out partial addresses and correcting misspellings. You get ten free lookups a day and they sell a "pro" version you can add to your Web site.
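To give a feel for what "fuzzy matching" means, here is a minimal sketch using Python's standard difflib module. It is not how QAS works, which the site does not explain; a real address verifier relies on a full postal database and far more elaborate rules. The street list, the function name and the 0.6 similarity cutoff are all assumptions made up for this example.

```python
import difflib
from typing import Optional

# Hypothetical reference list; a real service would use a postal database.
KNOWN_STREETS = ["Broadway", "Bardstown Road", "Frankfort Avenue", "Preston Highway"]


def suggest_street(typed: str, cutoff: float = 0.6) -> Optional[str]:
    """Return the known street most similar to what was typed, or None."""
    # Compare in lowercase so capitalization does not affect the score.
    by_lowercase = {street.lower(): street for street in KNOWN_STREETS}
    matches = difflib.get_close_matches(
        typed.lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    return by_lowercase[matches[0]] if matches else None


print(suggest_street("Bardstwon Road"))  # a transposed-letter misspelling
print(suggest_street("Zzzz"))            # nothing close enough, so None
```

Raising the cutoff makes the matcher stricter; lowering it catches sloppier typing at the cost of more wrong guesses.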

Insider trading monitoring

SECForm4 is "a real-time insider trading data monitoring service." It provides lots of different ways to look at stock trading by company insiders and lets you view copies of the actual Securities and Exchange Commission filings as well.

Free intranet search from IBM and Yahoo!

Given the names involved, this looks like a promising way to add good search to your intranet: "IBM OmniFind Yahoo! Edition is a no-charge, entry-level enterprise search software solution that enables rapid deployment of intranet and file system search for both employees and customers." You can customize it and make it look like your site. Better yet, you don't have to pay for an expensive search appliance, as with Google. There's a Flash video explaining it.

Tuesday, January 9, 2007

Documenting our work

Tom Johnson of the Institute for Analytic Journalism makes a good point about how we can prevent errors like the one The New York Times made when writing about a case in El Salvador:

Most journalists today write on word-processing programs that have the capability to insert comments, footnotes or endnotes anywhere in a story.


Editors should require that reporters provide one of these insertions after every paragraph, stating how they know what they just wrote. Such notes could include U.R.L.'s, references to specific pages in a reporter's notebook, reference documents (for example, the court transcript you referred to) — anything that would allow the reporter and editors to backtrack from the manuscript and judge the veracity of the content or conclusions.


I would encourage making as much of this type of sourcing as possible available to online readers so they can judge for themselves, especially when public sites and documents are referenced.

Monday, January 8, 2007

Podzinger offers YouTube audio/video search

Podzinger, the audio/video search engine, now offers the option to search the contents of YouTube videos. I tested it on a video I knew existed, of the CJ's James Bruggers being interviewed at an environmental journalism conference, but Podzinger didn't find it when I entered his name or when I entered a phrase I knew was in the video ("freedom of information"). A search on Louisville, meanwhile, turns up a man trying to stuff a cat into a blender and an old Denny Crum video. I leave it to you to decide how worthwhile that is.

Glossary of architectural terms

... offered by Archiseek.

Free OCR

GOCR is free, open source optical character recognition software for converting images of text to text.


Friday, January 5, 2007

Emerging Infectious Diseases

As if you didn't already have enough to worry about: You can read the Centers for Disease Control and Prevention's magazine, Emerging Infectious Diseases, online. One of the "expedited" articles on the site today is a letter on "Pandemic Influenza School Closure Policies." The letter, by a member of the research staff at the Program on Science and Global Security at Princeton University, says:

Although the United States is a nation dedicated to federalism, an uncoordinated approach for community response measures such as school closure decisions could jeopardize our efforts in containing a deadly pandemic. If schools were to remain open until a certain percentage of students and faculty became ill, as they do during typical influenza seasons, then control measures to contain the outbreaks would likely be far more difficult to achieve because a chain of transmission would be established.



Metrics 2.0: Market intelligence

A while ago a reader suggested I check out Metrics 2.0, whose tagline is "The Big Picture By Numbers." I took a glance at it and thought it looked so information-rich that I needed to take some time exploring it before writing about it. So I put it aside and promptly ... well, forgot about it. So my apologies to that reader. I was reminded of it again while catching up on blog posts accumulated over the holidays: Robert Berkman, the editor of The Information Advisor, declares Metrics 2.0 "one of the best free market intelligence sites I've seen in a long time. ... it is just packed with timely, substantive hard market research data culled from top notch authoritative news, trade, and research sources." Berkman also notes that the "site is a bit mysterious as there is no information on its founder or who is behind it." That's no small thing when a site is asking you to trust it as an information source, but Berkman promises to check it out and report back. In the meantime, here's how the site describes itself:

Metrics 2.0 is the most comprehensive source for succinct data-driven market intelligence and insights with the most relevant stats, facts, figures, forecasts, and charts on global market trends sweeping a wide range of areas including economy, business, financial markets, technology, internet, Web 2.0 and the interplay.



Metrics 2.0 is the preferred destination for leading business executives, investment professionals, market analysts, technology leaders, and entrepreneurs from Global 2000 organizations as well as budding entrepreneurial organizations looking for a quick and relevant data that impacts decision making.



Metrics 2.0 aggregates and synthesizes data on key market trends, business performance metrics, global economic indicators, financial markets, and technology trends from hundreds of popular and highly regarded sources including government agencies, trade organizations, market research firms, investment research analysts, consulting & advisory firms, company sources, and other major data providers.

Tuesday, January 2, 2007

Free ZIP code database with latitudes and longitudes

... offered by About.com in Microsoft Access format.
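One thing latitudes and longitudes make possible is estimating the straight-line distance between two ZIP codes. Below is a minimal sketch of the standard haversine formula. The function name is my own, the Earth radius is the commonly used mean value, and the coordinates in the usage lines are rough city-center figures for Louisville and Lexington chosen for illustration, not values taken from the About.com database.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean radius; the Earth is not a perfect sphere


def haversine_miles(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in miles between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


# Approximate coordinates, for illustration only.
louisville = (38.25, -85.76)
lexington = (38.04, -84.50)
print(round(haversine_miles(*louisville, *lexington), 1), "miles")
```

This gives distance as the crow flies, so driving distances will be longer.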

Zamzar: free file conversion

"Have you ever wanted to convert files without the need to download software?" Zamzar asks. Have I ever. This free site will convert a PDF file to a text file, a comma-separated file into an Access database and a .wav music file into an MP3, among many variations. You can convert a file up to 100 megabytes. I fed it a complicated academic paper with multiple columns, graphics and mathematical formulas in PDF and Zamzar converted it to a Microsoft Word file, preserving most of the document formatting. It failed only on some of the mathematical formulas, such as being unable to recreate a Greek sigma. You upload the file and Zamzar emails you back a link to download the converted version, which expires after a day. I'm skeptical they can continue to offer such a bandwidth-hungry service for free, but they say they're supported by "advertising tie-ins."

FedBizOpps.gov

FedBizOpps.gov publicizes opportunities to do business with the federal government. These are bid solicitations worth more than $25,000. A search for Kentucky turned up Ft. Knox seeking bids for food service, the General Services Administration looking for leased office space in Hazard and the U.S. Army Corps of Engineers asking for bids on paving at Lake Barkley. It could be a way to find a story or two.