Friday, August 31, 2007

Would you take someone's scalp to the coroner?

I think we've taken ethical hand-wringing way too far when we have to debate whether it's journalistically sound to deliver a dead man's scalp to the coroner at the request of the deceased's distraught friend:

The ethical conundrum was two-fold: Should a reporter accept proffered body parts? And, if a reporter does accept said body parts, has he become so tied up in the story that he can no longer objectively write it?

In other words, yes, I would.

The Phrase Finder

... lists more than 1,400 phrases and idioms, offers a "phrase of the week" email and has a lively forum for discussing the origin and meaning of phrases.

The first item I came across perusing the forums was a discussion of "to disappear up one's own arse":

Sad to say, that's a phrase I know all too well from personal experience.

ReporTwitters

The brand-spanking-new ReporTwitters is a "Twitter platform dedicated to reporters":

  • Let the world experience first hand how you go about your reporting. From the minute you think of an article idea until you hit the 'publish now' button.
  • You'll have the chance to work alongside other reporters in the thralls of fast paced news.
  • Even use Twitter for interviews! Just twitter about the necessity of the interview and invite the candidate to our site, where your twitters can set the scene for a lively interaction.
  • Have another reporTwitter proofread your stuff in return for your proofing theirs, using the Twitter messaging feature.
  • Your work might be spotted by news editors on the lookout for that expert on biofuel gases in Avignon, France. Or who are canvasing the site for that hard nose generalist that speaks three billion languages.

    I've been around reporters long enough to know that a continual brain dump from one is a truly unpleasant thing, but to each his or her own.

    Thursday, August 30, 2007

    LegiStorm Congressional travel database

    LegiStorm, whose Congressional staff salary database was mentioned here previously, also offers a database of Congressional travel. Their travel database identifies "which trips took place at a time and location coinciding with major events - like the Superbowl or Mardi Gras - which may have provided additional travel incentive."

    BusinessDictionary.com, an Online Business Dictionary

    BusinessDictionary.com says its more than 20,000 definitions cover "every aspect of the business world."

    Wednesday, August 29, 2007

    Evolution personal data mining software

    Evolution is software that searches online sources, such as social networking sites, Wikipedia, Google Books and phone directories, for names, numbers, email addresses, phrases and Web domains, and graphically displays the links between them. I installed it after reading a Linux.com article that called it "a kick-ass application, just seething with power and potential":

    Still don't grok it? Think of the NSA sifting through network traffic, looking for actionable intelligence. Or if that's too conspiracy-minded for your taste, think of trying to find something new and meaningful in the results of a Google search on Paris Hilton. Evolution is kind of like that, but more aggressive in finding results, and a lot more aggressive in trying to make sense of them.

    I searched for just my name and it quickly found my telephone number, my email addresses, my wife, open record appeals I had written, this blog and my personal Web site, references to me on other sites, my latitude and longitude and home city, Google Book hits citing my name and more. You can then repeat searches (called "transforms") on entities it finds, exploring potential relationships. It also turned up a lot of information unrelated or only marginally related to me, and of course it's a lot easier to distinguish the relevant from the irrelevant if you're researching yourself. If you start cold on someone you don't know much about, you're going to have to do a lot more legwork to nail down what's meaningful and what isn't. Still, I was impressed. It runs on Windows, Macintosh and Linux. The Linux.com article quoted the creator, Roelof Temmingh, saying he's undecided about what to do with Evolution, which is free, at least for now, and still in beta: "He said that he needs to make some money from Evolution or it will die," the article said. "He is considering everything from advertising to subscriptions, or selling the GUI and transforms, or selling only the GUI and making the transforms open source, and he is open to other suggestions."

    UPDATE: You can no longer download Evolution from its Web site. Linux.com reports that the creator announced he had removed it "due to circumstances outside of my control. I am not sure how long this outage will last, but perhaps it will be permanent." The site now says that if you want to see what Evolution can do, contact evolution@paterva.com.

    Sullr reverse telephone directory

    Sullr is a reverse telephone directory that plots the names and addresses it finds on a Google map. It covers the U.S., Argentina, Belgium, Italy, France, Luxembourg and Spain. The creators, who are from Argentina, say they are working to add more countries, "as long as the laws of each country allow it." I like this refreshing honesty from their FAQ, which answers the question, "Is Sullr a free service?":

    Of course! Do you have any ideas on how to make it profitable?

    Friday, August 24, 2007

    A widget and XML for tracking presidential campaign fundraising


    Maplight.org today released a widget "that allow anyone to track presidential fundraising on their own blogs, social media sites, and personal Web sites." The widget is a customizable bar chart showing amounts raised by each candidate. The site, devoted to exposing the relationship between money and politics, also unveiled an API -- application programming interface -- "that makes it easy for any Web developer to build their own site or software program that displays or shares up-to-date campaign contributions from the FEC." The API returns XML summarizing contributions for one or more candidates. All this is free and open source. What sets Maplight apart is that it doesn't just provide campaign finance data -- it relates that data to actual votes, showing the correlations between the two. Here's a screencast that shows how the site works. Right now it has data for the U.S. Congress and California legislators, and plans to expand to include the ten most populous states. Maplight, incidentally, received a grant from the aforementioned Sunlight Foundation.
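For the code-inclined, here is a rough idea of what consuming that kind of XML could look like in Python. This is only a sketch: the element and attribute names (candidates, candidate, name, total_raised), the sample figures and both function names are invented for illustration and are not taken from Maplight's actual API, so check the real response format before borrowing any of it.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML. The tag and attribute names and the dollar amounts
# are made up for this example; they are NOT Maplight's real schema.
SAMPLE_XML = """
<candidates>
  <candidate name="Candidate A"><total_raised>1500000.50</total_raised></candidate>
  <candidate name="Candidate B"><total_raised>2250000.00</total_raised></candidate>
</candidates>
"""


def totals_by_candidate(xml_text):
    """Return {candidate name: total raised} from the assumed XML layout."""
    root = ET.fromstring(xml_text.strip())
    totals = {}
    for cand in root.findall("candidate"):
        amount = float(cand.findtext("total_raised", default="0"))
        totals[cand.get("name")] = amount
    return totals


def text_bar_chart(totals, width=40):
    """Build a crude text bar chart, largest total first."""
    if not totals:
        return []
    largest = max(totals.values()) or 1.0  # avoid dividing by zero
    lines = []
    for name, amount in sorted(totals.items(), key=lambda kv: -kv[1]):
        bar = "#" * round(width * amount / largest)
        lines.append(f"{name:<12} {bar} ${amount:,.2f}")
    return lines


if __name__ == "__main__":
    for line in text_bar_chart(totals_by_candidate(SAMPLE_XML)):
        print(line)
```

The parsing and the charting are kept in separate functions so the same totals could just as easily feed a spreadsheet or a database as a chart.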

    Bogus software awards

    The blogger at Successful Software says he created a fake program that doesn't run, "Awardmestars," and it won 16 different awards from software download sites. Many Web sites selling software display icons from these download sites proclaiming that their programs have earned a "5-star rating":

    The obvious explanation is that some download sites give an award to every piece of software submitted to them. In return they hope that the author will display the award with a link back to them. The back link then potentially increases traffic to their site directly (through clicks on the award link) and indirectly (through improved page rank from the incoming links). The author gets some awards to impress their potential clients and the download site gets additional traffic.

    I like some of the comebacks on Slashdot:

    He's obviously missing the point. Among all of the software that does nothing, his is clearly the best.

    and

    You think that's bad, the guy in the next office to mine has a coffee cup that reads "World's Greatest Dad!"

    Thursday, August 23, 2007

    Free + database = Freebase

    Freebase, as any Richard Pryor fan knows, is a smokable form of cocaine. It's also "a uniquely structured database that you can easily search, add to and edit":

    It's a data commons in the way that a public square is a land commons—available to anyone to use.

    Freebase covers millions of topics in hundreds of categories. It's been seeded with a few million topics from open sources, including Wikipedia and Musicbrainz, and while the first topics have mostly been in media categories like movies, music, and television, the Freebase community has already added thousands more topics on subjects from philosophy to European railway stations to the chemical properties of ingredients.

    I first learned of it way back in March, when only the Web 2.0 avant garde was allowed in. I also listened to a podcast with one of the co-founders last month, although truthfully I still don't have a very good handle on what it's all about. It's still in "alpha" and "read only" unless you've been granted "write" access. I was given write access a while ago, but haven't yet found the time to give it a try. If you're interested in checking it out yourself, I have free invitations to give away to the first ten people who email me.

    Podcast with the Sunlight Foundation's chief data architect

    If you're into computer-assisted reporting and public accountability you need to pay attention to The Sunlight Foundation, which is doing great work making public data more accessible. In a podcast on IT Conversations, Jon Udell interviews the Sunlight Foundation's Greg Elin, its chief data architect, who discusses the technical background of projects such as LOUIS, the Open House Project, Open Congress and What's McConnell Hiding.

    PolitiFact and the transformative power of Django

    Matt Waite of the St. Petersburg Times is the chief developer of PolitiFact, which he says "marks a major shift" in his career:

    The site is a simple, old newspaper concept that's been fundamentally redesigned for the web. We've taken the political "truth squad" story, where a reporter takes a campaign commercial or a stump speech, fact checks it and writes a story. We've taken that concept, blown it apart into it's fundamental pieces, and reassembled it into a data-driven website covering the 2008 presidential election.

    The whole site is inspired by Adrian Holovaty's manifesto on the fundamental way newspaper websites need to change. Adrian's main theme was that certain kinds of newspaper content have consistent pieces that could be better served to the reader from a database instead of a newspaper story. I built PolitiFact with that in mind.

    Essentially the site rates the truthfulness of statements made by the presidential candidates. I especially like that its "Truth-o-meter" rejects mealymouthed phrasing and instead boils statements down to "TRUE," "MOSTLY TRUE," "HALF TRUE," "FALSE" and "PANTS ON FIRE." More impressive is that Waite, who had lots of help, had never developed a Web site before. He created the site with Django, an open source Web development framework that uses the Python programming language. He says using it "has been a transformative experience."

    Beyond being an experiment in journalism or web development, PolitiFact is an experiment in entrepreneurship. We've developed a product that uses reporting labor from the St. Petersburg Times and our sister company Congressional Quarterly to create something that doesn't originate in print. All the talk and all the focus lately in web journalism circles is on local, local, local and to some degree they're right. But there's also something to be said for just putting a good idea on the web that people might find useful. We think we've done that. Now the important part: how are people going to respond? We have no idea. We're anxious to find out.

    Twitter people search

    Twitter has added people search. " … it searches profile information such as name, location, bio, and url," an email from Twitter said.

    Tuesday, August 21, 2007

    Embed a custom Google map on your Web site

    You will see a lot more Google Maps on newspaper Web pages now that Google has made it easy to embed customized versions of its maps anywhere. Google Maps Mania shows you how it works. This doesn't have the flexibility that comes from hand-coding them yourself, but now you can make a dynamic locator map in minutes, and given how hard that was just a short while ago, that's no small thing.

    Spock people search engine

    Spock is a new people search engine. SearchEngineWatch quoted a Spock "VP of product" recently who said it has indexed more than 100 million people so far and plans to eventually index billions. Tim O'Reilly gushed about Spock months ago, when only a chosen few were allowed access, saying it "performs a unique function that is well outside the range of capabilities of current search engines. What's more, it's got a fabulous interface for harvesting user contribution to improve its results." Yawn. I won't be impressed until it can tell me more about myself than Google. Right now, the only Mark Schaver it knows is some New York salesman, and who could possibly be interested in him?

    Monday, August 20, 2007

    Overcoming Bias blog

    Overcoming Bias is a blog from the Future of Humanity Institute at Oxford University:

    How can we better believe what is true? While it is of course useful to seek and study relevant information, our minds are full of natural tendencies to bias our beliefs via overconfidence, wishful thinking, and so on. Worse, our minds seem to have a natural tendency to convince us we that are aware of and have adequately corrected for such biases, when we have done no such thing.

    In this forum we discuss whether and how we might avoid this fate, by spending a bit less effort on each specific topic, and a bit more effort on the general topic of how to be less biased. Here we discuss common patterns of bias and self-deception, statistical and other formal analysis tools, computational and data-gathering aids, and social institutions which may discourage bias and encourage its correction. Other topics may be discussed to the extent they exemplify important biases and correction issues.

    Institutional Review Board Watch

    ... describes its reason for being this way:

    Institutional review boards have been set up at nearly all research institutions in the US, to protect the welfare of human research participants.

    Over the past decade, IRB's have grown greatly in power and range of authority. The home institutions have, however, largely abrogated their responsiblity to oversee and control the procedures followed by IRB's. As a consequence, the IRB's have increasingly harrassed researchers and slowed down important research, without protecting any human research participants.

    The purpose of this site is to chronicle the abuses by IRB's.

    I'd say one of the first rules of being a watchdog is to explain who you are, which this site doesn't do as far as I can see. Or at the very least, you should explain why you can't explain who you are.

    I learned of this site from a posting on Statistical Modeling, Causal Inference, and Social Science, which agrees the process can "get a bit Kafka-esque."

    "Scrubbing and promoting your online image or uncovering someone else’s"

    is the subject of a recent post at PIbuzz.com.

    Thursday, August 16, 2007

    New cell phone directory

    The increasing use of cell phones has made life more difficult for reporters because, unlike with land lines, there are no comprehensive cell phone directories. Intelius, however, has unveiled a cell phone directory that it claims has more than 120 million listings. You can search for free, but to actually see a number you have to pay $15 -- less if you sign up to be a "Club Intelius member." The Seattle Times reports that Intelius says it hopes to soon have 240 million listings, equivalent to "nearly every single subscriber's digits in the U.S." The Times quotes former NFL player and current cell phone industry lobbyist Steve Largent calling the service "a scam" because the company keeps the money it charges even if the information it provides is wrong. The story quotes an Intelius co-founder, Ed Petersen, saying the company will give refunds on reverse number searches that turn up nothing, but not if the information it does turn up proves to be incorrect. He says the company gets its cell phone information from marketing companies and public records.

    Wednesday, August 15, 2007

    Create and share timelines with xtimeline

    xtimeline "is a place for you to create, share and discuss interesting timelines." You can do it by feeding the free service an RSS feed or a spreadsheet, and you can embed timelines on your own Web site.

    I'll take a screwdriver with my unemployment

    Henry Blodget of the Silicon Alley Insider runs the numbers and explains "Why Newspapers are Screwed":

    Do you know why they're screwed? It's actually not the cost of paper, ink, trucks, printing plants, and other physical distribution expenses. Rather, it's the cost of content creation.

    Senior New York Times reporters believe they are underpaid, and, relative to other highly educated folks at the peak of their professions, they sure are. But relative to the online revenue they generate, those talented reporters, columnists, editors, and researchers actually cost a fortune.

    Web data browser

    Kirix Strata is a browser for gathering and manipulating data on the Web. You point Strata at a CSV file, an HTML table or an RSS feed on a Web site and it extracts the data, which you can then filter, join with other data, manipulate with formulas, export to other formats or upload to a database. A screencast demonstration says Strata "gives database structure to Web data." "… it was built to solve the problem of data usability," the company explains on its blog. "Basically, we're trying to give people the ability to handle structured data really easily, wherever they may encounter it." I gave Strata a try and did find it easy to use, and, at least for some operations, faster than manually grabbing data online and massaging it with scripting languages, spreadsheets and databases, as I normally do. There are Windows and Linux versions, with a Mac version promised for later. It's free only for now, while it's in beta, though they promise free licenses for the final version to users who give them feedback.
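For comparison, here is roughly what the do-it-by-hand scripting route looks like for the same kind of job: read two CSV sources, filter one, join it to the other and export the result. This is a minimal Python sketch using only the standard library. It is not Strata's interface, and the sample data, column names and function names are all made up for illustration.

```python
import csv
import io

# Invented sample data standing in for CSV files found on the Web.
DONATIONS_CSV = """donor_id,amount
1,250
2,1000
3,75
"""

DONORS_CSV = """donor_id,name,state
1,Ann Example,KY
2,Bob Sample,IN
3,Cy Placeholder,KY
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate(row) is true."""
    return [row for row in rows if predicate(row)]


def join_rows(left, right, key):
    """Inner-join two row lists on a shared key column."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def to_csv(rows):
    """Export rows back to CSV text, using the first row's columns."""
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()),
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Donations of $100 or more, matched to donor names and states.
big = filter_rows(read_rows(DONATIONS_CSV), lambda r: int(r["amount"]) >= 100)
joined = join_rows(big, read_rows(DONORS_CSV), "donor_id")

if __name__ == "__main__":
    print(to_csv(joined))
```

It works, but every new data source means another round of this kind of plumbing, which is the tedium a tool like Strata is trying to remove.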

    Tuesday, August 14, 2007

    Site unmasks organizations making anonymous Wikipedia edits

    A Caltech graduate student has created a site, WikiScanner, that reveals organizations whose employees have made anonymous edits of Wikipedia, Wired reports:

    On November 17th, 2005, an anonymous Wikipedia user deleted 15 paragraphs from an article on e-voting machine-vendor Diebold, excising an entire section critical of the company's machines. While anonymous, such changes typically leave behind digital fingerprints offering hints about the contributor, such as the location of the computer used to make the edits.

    In this case, the changes came from an IP address reserved for the corporate offices of Diebold itself. And it is far from an isolated case.

    Other organizations fingered include the CIA, Microsoft and Congress. The site appears to be struggling with the attention it's already generated, because I tried entering The Courier-Journal in the search box and it churned away for several minutes before telling me it could find no anonymous edits by anyone from our newspaper.
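The underlying idea, matching the IP address left behind by an anonymous edit against address ranges registered to an organization, is simple enough to sketch in a few lines of Python. To be clear, this is my own illustration of the concept and an assumption about the approach, not WikiScanner's actual code. The organization names and edits are invented, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical lookup table of organizations and the address blocks
# registered to them. These are documentation-only ranges, not real owners.
ORG_RANGES = {
    "Example Corp": ["192.0.2.0/24"],
    "Example Agency": ["198.51.100.0/25"],
}

# Invented anonymous edits: the article touched and the editor's IP address.
ANON_EDITS = [
    {"article": "Example Corp", "ip": "192.0.2.17"},
    {"article": "Some Topic", "ip": "203.0.113.5"},
    {"article": "Example Agency", "ip": "198.51.100.99"},
]


def owner_of(ip_text, org_ranges):
    """Return the organization whose block contains the address, or None."""
    ip = ipaddress.ip_address(ip_text)
    for org, cidrs in org_ranges.items():
        if any(ip in ipaddress.ip_network(cidr) for cidr in cidrs):
            return org
    return None


def edits_by_org(edits, org_ranges):
    """Group edited article titles under the organization that owns the IP."""
    result = {}
    for edit in edits:
        org = owner_of(edit["ip"], org_ranges)
        if org is not None:
            result.setdefault(org, []).append(edit["article"])
    return result


if __name__ == "__main__":
    for org, articles in edits_by_org(ANON_EDITS, ORG_RANGES).items():
        print(org, "->", ", ".join(articles))
```

A real version would need a complete table of who owns which address blocks and a full dump of anonymous edits, which may help explain why the live site was churning for minutes.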

    Overview of online note-taking tools

    Web Worker Daily gives a brief overview of "7 Apps for Online Note-Taking":

    If you're like most of us, you deal with piles of unstructured information every day: phone numbers, ideas for later consideration, snippets of information from the web, recipes, phone messages…the list is endless. For the web worker, moving this information into an online notebook can be an attractive proposition. Rather than tie yourself to one computer, or even one operating system, you can get at your notes from anywhere that has a web browser handy.

    I've tried a couple of these, such as Google Notebook and Zoho Notebook, but for various reasons found them wanting. My preferred applications for collecting Web flotsam these days are the offline OneNote and ClipMate.

    Tuesday, August 7, 2007

    Journal Info

    Journal Info provides information on more than 18,000 academic journals organized by subject. The site, by Sweden's Lund University Libraries, is intended to make it easier for researchers to choose where to publish their results. It's also a useful tool because it rates journals' influence, helps you find free journals, and tells you which database services distribute a journal's archives.

    Cyndi's List: Genealogy Sites on The Internet

    Cyndi's List, "Your genealogy starting point online for more than a decade!," says it has more than 261,000 links of interest to genealogists. This can be useful stuff if you want to find the dead or their living kin.

    Thursday, August 2, 2007

    Modern Approaches to Data Visualization

    ... as corralled by Smashing Magazine:
    " ... to convey a message to your readers effectively, sometimes you need more than just a simple pie chart of your results. In fact, there are much better, profound, creative and absolutely fascinating ways to visualize data. Many of them might become ubiquitous in the next few years."

    Wednesday, August 1, 2007

    Understanding FBI Records

    Courtesy of FOIA Facts:
    "Many FOIA requesters are confused when they make a request to the FBI and get a 'no record' response even though they are sure that there is a record on the subject of their request at the FBI. The FBI isn't lying-they just have devised a system that makes requesters to go through hoops to find the information they are seeking."

    Daily Writing Tips on writing numbers and numerals

    Daily Writing Tips offers "10 Rules for Writing Numbers and Numerals," although from the comments, not everyone agrees they're all valid. Daily Writing Tips, written by a four-member team, calls itself "a blog where you will find simple yet effective tips to improve your writing."