Select glossary for digital media by Steve Myers

September 11, 2012

The days are over when a journalist could ignore those geeks in the corner who typed lines of code, worked on the website and spoke in a bizarre language populated with acronyms. Any journalist’s story now may be distributed with an API; information gathered by a reporter could be used in a mashup or shared via Scribd.

This glossary will help you wade through such terms. They relate to Web standards, programming, online tools, social networking, online advertising and basic technology. If you’re particularly challenged, this thing called an iPhone is even defined for you. Get started now — the list will grow more quickly than your relatives join Facebook.

–Steve Myers

Android — Usually used in the context of Android phone, Android is a free and open source operating system developed by Google that powers a variety of mobile phones from different manufacturers and carriers. It is a rival of the iPhone platform. In contrast to Apple’s tightly controlled architecture and App Store, Android allows users to install apps from the Android Market and from other channels, such as directly from a developer’s website — which allows for X-rated content, for example. Some well-known Android phones are the Nexus One, the Motorola Droid and HTC Evo. Expect to see competitors to the iPad running a version of Android.

App — Short for application, a program that runs inside another service. Many mobile phones allow apps to be downloaded, leading to a burgeoning economy for modestly priced software. Can also refer to a program or tool that can be used within a website. Apps generally are built using software toolkits provided by the underlying service, whether it is iPhone or Facebook.

Blog — One of the first widespread web-native publishing formats, generally characterized by reverse chronological ordering, rapid response, linking, and robust commenting. While originally perceived to be light on reporting and heavy on commentary, a number of blogs are now thoroughly reported, and legacy media organizations have also launched various blogs. Originally short for “web log,” blog is now an accepted word in Scrabble.

Blogger — A simple, free blogging platform created by Pyra Labs, which was sold to Google in 2003. It was one of the first mass blogging services and is credited with popularizing the format. Unlike WordPress, it is not open source. Many Blogger sites are hosted at

Social media — A broad term referring to the wide swath of content creation and consumption that is enabled by the many-to-many distributed infrastructure of the Internet. Unlike legacy media, where the audience is usually on the receiving end of content creation, social media generally allows three stages of interaction with content: 1) producing, 2) consuming and 3) sharing. Social media is incredibly broad and refers to blogging, wikis, video-sharing sites like YouTube, photo-sharing sites like Flickr and social networking sites like Facebook and Twitter.

Twitter — A microblogging and social media service where users can send out messages limited to 140 characters. Launched in 2006, Twitter became popular in part because it had a set of APIs that allowed other developers to build tools on top of it. Twitter users came up with their own conventions, including the @ symbol to denote user names (@nytimes), and #, the hashtag, to denote subjects (#sxsw). Twitter computes Trending Topics, which give a real-time view into the most talked-about topics on the service.
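
The conventions in this entry are simple enough for a program to pick out. The following is a minimal sketch in Python, assuming simplified patterns (letters, digits and underscores) rather than Twitter's own parsing rules; the function name describe_tweet and the sample message are hypothetical.

```python
import re

MAX_TWEET_LENGTH = 140  # the character limit described in this entry

# Simplified patterns: an assumption for illustration, not Twitter's own rules.
MENTION_PATTERN = re.compile(r"@(\w+)")
HASHTAG_PATTERN = re.compile(r"#(\w+)")


def describe_tweet(text):
    """Return the user names, hashtags and length check for one message."""
    return {
        "mentions": MENTION_PATTERN.findall(text),
        "hashtags": HASHTAG_PATTERN.findall(text),
        "fits_limit": len(text) <= MAX_TWEET_LENGTH,
    }


example = describe_tweet("Great panel with @nytimes at #sxsw")
print(example)
```

A tool built on Twitter data could use a summary like this to group messages by subject or by the accounts they mention.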

Wiki — A website with pages that can be easily edited by visitors using their web browser. The term is also gaining acceptance as a prefix meaning “collaborative.” Ward Cunningham created the first wiki, naming it WikiWikiWeb after the Hawaiian word for “quick.” A wiki enables the audience to contribute to a knowledge base on a topic or share information within an organization, like a newsroom. The best-known wiki in existence is Wikipedia, which burst onto the scene in 2001 as one of the first examples of mass collaborative information aggregation. Other sites that have been branded “wiki” include Wikinews, Wikitravel, and WikiLeaks (which was originally but is no longer a wiki).

WordPress — The most popular blogging software in use today, in large part because it is free and relatively powerful, yet easy to use. First released by Matt Mullenweg in 2003, WordPress attracts contributions from a large community of programmers and designers who give it additional functionality and visual themes. Sites that use WordPress include the New York Times blogs, CNN and the LOLCats network. It has been criticized for security flaws.

API (Application Programming Interface) — The way computer programs share data and functionality with other computer programs. APIs are an increasingly critical part of the Internet’s interconnection. Many say that the future of the Internet lies in APIs because they help distribute and combine content. On the Web, APIs are generally special URLs that give back machine-readable data, in formats like JSON or XML, rather than human-readable data, which is usually HTML. Facebook, Twitter and Google Maps all have APIs that allow other websites or computer programs to use their underlying tools. The New York Times and NPR have also released APIs that allow other programs to draw on archives of movie reviews, restaurant reviews and articles.
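
To make the difference between machine-readable and human-readable data concrete, here is a small Python sketch. The two payloads are hypothetical stand-ins for what a movie-review API might return; they are not taken from any real service, and the field names are assumptions.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical payloads: the same review as a JSON API and an XML API might send it.
JSON_RESPONSE = '{"title": "Example Film", "rating": 4, "critic": "A. Reviewer"}'
XML_RESPONSE = (
    "<review><title>Example Film</title>"
    "<rating>4</rating><critic>A. Reviewer</critic></review>"
)


def review_from_json(payload):
    """Pull the title and rating out of a JSON payload."""
    data = json.loads(payload)
    return {"title": data["title"], "rating": int(data["rating"])}


def review_from_xml(payload):
    """Pull the same two fields out of an XML payload."""
    root = ET.fromstring(payload)
    return {"title": root.findtext("title"), "rating": int(root.findtext("rating"))}


json_review = review_from_json(JSON_RESPONSE)
xml_review = review_from_xml(XML_RESPONSE)
print(json_review == xml_review)  # both formats carry the same information
```

In practice the payload would arrive from one of the special URLs described above; the parsing step would look much the same.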

Algorithm — A set of instructions or procedures used in order to accomplish a task, such as creating search results in Google. In the context of search, algorithms are used to provide the most relevant results first based on those instructions.
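
As an illustration only, the sketch below ranks a few short texts by how often they contain the search words. It is a toy set of instructions written for this glossary, not how Google or any real search engine works; the function names and sample pages are hypothetical.

```python
def relevance(query, document):
    """Count how many times the query's words appear in the document."""
    words = document.lower().split()
    return sum(words.count(term) for term in query.lower().split())


def search(query, documents):
    """Return the documents that match, most relevant first."""
    scored = [(relevance(query, doc), doc) for doc in documents]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in matches]


pages = [
    "city council votes on budget",
    "budget cuts hit schools as budget talks stall",
    "local team wins championship",
]
results = search("budget", pages)
print(results)
```

The page that mentions "budget" twice comes first, the page that mentions it once comes second, and the page that never mentions it is left out.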

Civic media — An umbrella term describing media technologies that create a strong sense of engagement among residents through news and information. It is often used as a contrast to “citizen journalism” because it also encompasses mapping, wikis and databases. MIT has a Center for Future Civic Media.

Cloud computing — An increasingly popular computing model in which information and software are provided on demand from over the Internet rather than staying on local computers. Cloud computing is appealing because companies can reduce the amount they spend on their own computer servers and software but can also quickly and easily expand as the company grows. Examples of cloud computing applications include Google Docs and Yahoo Mail. Amazon offers two cloud computing services: EC2, which many start-ups now use as a cheap way to launch their products, and S3, an online storage system many companies use for cheap storage.

Client side — Referring to network software where work takes place on the user’s computer, the client, rather than at the central computer, known as the server. Advantages of doing so include speed and bandwidth. An example is JavaScript, a programming language that allows developers to build interactivity into websites. The work is done within the browser, rather than at the hosting website. (See also server side)

CMS (Content Management System) — Software designed to organize large amounts of dynamic material for a website, usually consisting of at least templates and a database. It is generally synonymous with online publishing system. The material can include documents, photos or videos. While the first generation of content management systems were custom and proprietary, in recent years there has been a surge in free open-source systems such as Drupal, WordPress and Joomla. Content management systems are sometimes built custom from scratch with frameworks such as Ruby on Rails or Django.
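
The "templates and a database" pairing can be sketched in a few lines of Python using only the standard library. This is a deliberately tiny illustration, not how Drupal, WordPress or Joomla are built; the table layout, template and sample article are assumptions, and a real system would also escape the text before placing it in HTML.

```python
import sqlite3
from string import Template

# The template decides how a page looks; the database holds what it says.
PAGE_TEMPLATE = Template("<h1>$headline</h1>\n<p>$body</p>")


def build_database():
    """Create an in-memory database holding one hypothetical article."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, headline TEXT, body TEXT)"
    )
    db.execute(
        "INSERT INTO articles (headline, body) VALUES (?, ?)",
        ("Council passes budget", "The vote was 5-2."),
    )
    db.commit()
    return db


def render_article(db, article_id):
    """Look up one article and pour it into the page template."""
    row = db.execute(
        "SELECT headline, body FROM articles WHERE id = ?", (article_id,)
    ).fetchone()
    return PAGE_TEMPLATE.substitute(headline=row[0], body=row[1])


database = build_database()
page = render_article(database, 1)
print(page)
```

Changing the template restyles every article at once, which is a large part of the appeal of a content management system.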

CPA (Cost Per Action) — A pricing model in which the advertiser is charged for an ad based on how many users take a specific, pre-defined action (such as buying a product from an online store) after viewing an ad. This is the “gold standard” for advertisers because it most directly matches the cost of an ad to its effectiveness. However, it’s not commonly used since it’s extremely difficult to measure: it is often unclear when or how to attribute an action to a specific ad. (Also sometimes referred to as Cost Per Acquisition.)

CPC (Cost Per Click) — A pricing model in which the advertiser is charged for an ad based on how many users click it. This is a common model for “search advertising” (the all-text ads associated with search results) and for text ads in general. CPC is well-suited for “directed” advertising, intended to prompt an immediate response, because a user’s clicking on an ad shows engagement with it. Google AdWords is generally priced on a CPC basis.

CPM (Cost Per Mille) — Cost per one thousand (often views). Much of online advertising — particularly display advertising — is priced on a CPM basis. (Mille = Latin for one thousand; we use “K” for “kilo” almost everywhere else in tech, but “M” for “mille” here, which causes some confusion.) CPM is well suited for “brand” or “awareness” advertising, in which the primary purpose of the ad is not necessarily to prompt an immediate response.
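
The arithmetic behind the CPA, CPC and CPM pricing models is easy to compare side by side. The Python sketch below uses made-up campaign numbers and rates, chosen only to keep the sums simple; none of them reflect real market prices.

```python
def cpm_cost(impressions, rate_per_thousand):
    """Cost per mille: pay for every thousand views."""
    return impressions / 1000 * rate_per_thousand


def cpc_cost(clicks, rate_per_click):
    """Cost per click: pay only when a user clicks the ad."""
    return clicks * rate_per_click


def cpa_cost(actions, rate_per_action):
    """Cost per action: pay only when a user completes the agreed action."""
    return actions * rate_per_action


# Hypothetical campaign: 200,000 views, 400 clicks, 10 purchases.
impressions, clicks, purchases = 200_000, 400, 10

cpm_total = cpm_cost(impressions, 5.00)   # $5 per thousand views
cpc_total = cpc_cost(clicks, 0.50)        # 50 cents per click
cpa_total = cpa_cost(purchases, 20.00)    # $20 per purchase

print(cpm_total, cpc_total, cpa_total)
```

With these assumed rates the same campaign costs $1,000 on a CPM basis but $200 on either a CPC or CPA basis, which shows why the choice of model matters to both advertiser and publisher.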

Creative Commons — A flexible set of copyright licenses that allow content creators to specify which rights they reserve and which they waive regarding their work. The licenses are meant to codify the collaborative spirit of the Internet. There are six main Creative Commons licenses based on four conditions that creators can choose to apply: Attribution, Share Alike, Non-Commercial, and No Derivative Works. The least restrictive of the licenses is Attribution, which grants anyone, from an individual to a large company, the right to distribute, display, or otherwise make use of the work so long as the creator is credited. The most restrictive is Attribution Non-Commercial No Derivatives, which grants only redistribution. First released in December 2002 by the nonprofit Creative Commons organization, which was inspired by the open source GNU GPL license, the licenses are now used on an estimated 130 million works worldwide. The glossary you are reading is released under a Creative Commons Attribution Share Alike license in an effort to encourage wide distribution and contribution. (Also see open source)

Data visualization — A growing area of content creation in which information is represented graphically and often interactively. This can be used for subjects as diverse as an analysis of a speech by the president and the popularity of baby names over time. While it has deep roots in academia, data visualization has begun to emerge on content sites as a way to handle the masses of data that are being made public, often by government. There are many tools for data visualizations, including Seattle-based Tableau and IBM’s Many Eyes. Data visualization should 1) tell a story, 2) allow users to ask their own questions and 3) start conversations.

Document-oriented database — An increasingly popular type of database. In contrast to relational databases, which rigidly require information to be stored in pre-defined tables, document-oriented databases are more free-flowing and flexible. This is important when you don’t know what is going to be thrown at you. Document-oriented databases retrieve information more quickly, but store it less efficiently. The same document-oriented database might let you store the information for an article (headline, byline, date, content, miscellaneous) or for a photo (file, photographer, date, cutline). MongoDB is a popular open source document-oriented database.
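
The flexibility described in this entry can be imitated with plain Python dictionaries. The sketch below is not MongoDB and does not use its API; the insert and find helpers and the sample records are hypothetical, written only to show records with different fields living in one collection.

```python
collection = []  # one store, no pre-defined table layout


def insert(document):
    """Store a copy of the document, whatever fields it happens to have."""
    collection.append(dict(document))


def find(**criteria):
    """Return every document whose fields match all the given values."""
    return [
        doc
        for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


insert({
    "type": "article",
    "headline": "Council passes budget",
    "byline": "J. Doe",
    "date": "2012-09-11",
    "content": "The vote was 5-2.",
})
insert({
    "type": "photo",
    "file": "council.jpg",
    "photographer": "R. Roe",
    "date": "2012-09-11",
    "cutline": "Council members vote on the budget.",
})

same_day = find(date="2012-09-11")
photos = find(type="photo")
print(len(same_day), len(photos))
```

The article and the photo share only some fields, yet a single query by date returns both. A relational table would need those columns agreed on in advance.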

Drupal — A popular content management system known for a vibrant open-source community that creates diverse and robust extensions. Drupal is very powerful, but it is somewhat difficult to use for simple tasks when compared to WordPress. Drupal provides options to create a static website, a multi-user blog, an Internet forum or a community website for user-generated content. It is written in PHP and distributed under the GPL open source license. uses Drupal.

Facebook Connect — A technology from Facebook that allows a reader to log into a third-party website with their Facebook account, rather than creating a new profile for that website. Facebook Connect, which is an API, also allows the third parties to pull certain data from the user’s profile, such as his or her name and age. In turn, the reader’s activities on the website can also be displayed on her or his Facebook profile. Launched in 2008, Facebook Connect was one of the first examples of Facebook extending itself into a platform for the entire Web. (Also see OAuth, Open ID)

Facebook community page — Introduced in April 2010, community pages were created as a counterpart to “official fan pages,” which are built around a specific person, company, organization, product, or brand. Community pages are mostly auto-generated around interests or affiliations found in people’s profiles, like cooking. Unlike with Facebook groups, there is no way to actively add content to the page. But because they are auto-generated, based on likes, they can quickly build gigantic memberships. Cooking, for example, has over 2 million fans. These pages are a bit confusing, and Facebook is still working out the kinks.

Facebook fan page — A Facebook profile for a specific person, product, company or organization, usually administered by official representatives. This is different from a Facebook personal page, which must be owned by an individual, and different from a Facebook community page, which is built around an interest not related to a brand, such as “cooking.” It is also different from a Facebook group. Fan pages can gather thousands or millions of fans through “likes,” and official posts by the page administrator generally go into the fans’ news streams. Once a page has more than 25 fans, it can claim a short-form URL. Facebook community and fan pages are strong players in ongoing efforts to bring content to people where they already are, instead of requiring them to come to the content.

Facebook group — Facebook groups are analogous to offline clubs. Unlike Facebook fan pages, groups do not have to be administered by official representatives. In addition, the activity posted in groups does not get pushed into users’ feeds. But as long as they have fewer than 5,000 members, Facebook groups are allowed to mass-message all their members.

Facebook personal page — A profile page tied to a single individual. What information appears is controlled (in theory) by the individual. However, because there is a 5,000-person limit to friends, some celebrities have fan pages instead. As of 2009, individuals can choose a username, which makes their page available at

Geotag — A piece of information that goes with content and contains geographically based information. Commonly used on photo sites such as Flickr or in conjunction with user-generated content, to show where a photo, video or article came from. There has been some discussion of its increasing relevance with geographically connected social networking sites, such as Foursquare. Twitter has implemented geotagging, and Facebook has announced plans to do so.
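
In practice a geotag is often just a latitude and longitude attached to a piece of content. The Python sketch below assumes that simple form; the class names, the approximate coordinates and the bounding box are illustrative assumptions, not the format used by Flickr, Twitter or any other service.

```python
from dataclasses import dataclass


@dataclass
class Geotag:
    latitude: float
    longitude: float


@dataclass
class Photo:
    title: str
    geotag: Geotag


def within_box(tag, south, west, north, east):
    """True if the geotag falls inside a rectangular area on the map."""
    return south <= tag.latitude <= north and west <= tag.longitude <= east


photos = [
    Photo("City hall", Geotag(40.7128, -74.0060)),      # roughly New York City
    Photo("Golden Gate", Geotag(37.8199, -122.4783)),   # roughly San Francisco
]

# A rough, hand-picked box around the New York area.
nyc_area = [p.title for p in photos if within_box(p.geotag, 40.4, -74.3, 41.0, -73.6)]
print(nyc_area)
```

A news site could use a filter like this to show readers only the photos or stories tagged near them.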

Google AdSense — Google’s online advertising network that allows content publishers to embed a piece of code to display Google ads on their sites. The ads are selected based on the content of the page. Ad revenue is split between Google and the publisher in an undisclosed proportion, generally believed to be two-thirds to the publisher. (Note: ads on Google’s own sites are covered by Google AdWords, not AdSense.)

Google AdWords — Google’s text-based flagship advertising product, which provides the lion’s share of the company revenue. Ads are displayed on Google’s own sites based on search terms that users type in, and advertisers pay only when the users click on them. The search terms, called keywords, are purchased by advertisers; availability of a given keyword is based in part on an auction system, and in part on the responsiveness of the audience.

iframe — An HTML tag that allows for one web page to be wholly included inside another; it is a popular way to create embeddable interactive features. Iframes are usually constructed via JavaScript as a way around web browsers’ security features, which try to prevent JavaScript on one page from quickly talking to JavaScript on an external page. Many security breaches have been designed using iframes.

iPad — Released in April 2010, the iPad is Apple’s tablet computing device, akin to a large iPod Touch; it uses the same operating system and development tools as the iPhone. It features a multitouch screen and comes in 3G and wifi versions. Some news organizations, including The New York Times, Wired and National Geographic, have created special applications designed for the iPad. Some have hoped that it would be the “Jesus” tablet that would breathe new life into legacy print publications. Upon its announcement in January 2010, many noted its name was reminiscent of feminine hygiene products.

iPhone — Apple’s smart phone has sold more than 50 million units worldwide since it launched in 2007. The first smartphone to introduce multitouch screen capability, it is considered in the same vertical as the Blackberry, Google’s Android and Palm Pre. The critical mass of iPhones, along with Apple’s pre-existing iTunes infrastructure, allowed Apple to launch the first truly robust marketplace for mobile applications, creating a whole new microeconomy for innovation.

iPod Touch — Essentially an iPhone without the phone. Slimmer than the iPhone, the iPod touch can play music and run iPhone apps. It connects to the Internet via wifi.

Mobile — An umbrella term in technology that was long synonymous with cellular phones but has since grown to encompass tablet computing (the iPad) and even netbooks. In retrospect, an early mobile technology was the pager. Sometimes the term is used interchangeably with “wireless.” It generally refers to untethered computing devices that can access the Internet over radiofrequency waves, though sometimes also via wi-fi. Mobile technology usually demands a different set of standards — design and otherwise — than desktop computers, and has opened up an entirely new area for geo-aware applications.

Open source — Open source refers to a philosophy and a means of developing and licensing software and other copyrighted works so that others are free to inspect, use and adapt the original source material. There are many open source licenses. Some licenses are considered permissive (e.g. MIT and BSD), allowing inclusion in proprietary works, while others (e.g. GNU GPL) require that the resulting derivative works remain under the same license if distributed. While the term originally stemmed from software practices, the concept has now been incorporated into other fields such as medicine and agriculture. Many of the most popular technologies used in content distribution, including languages and publishing platforms, are open source. The glossary you are reading was developed using open source methodology and is available under a Creative Commons license.

Operating system — A basic layer of software that controls computer hardware, allowing other applications to be built on it. The most popular operating systems today for desktop computers are the various versions of Microsoft Windows, Mac OS X and the open-source Linux. Smart phones also have operating systems. The Palm Pre uses webOS, numerous phones use Google’s Android operating system, and the iPhone uses iOS (formerly known as iPhone OS).

RSS (Really Simple Syndication) — A standard that lets websites push their content to readers as regular updates, which arrive through a “feed reader” or “RSS reader.” The symbol is generally an orange square with radiating white quarter circles. (Also see Atom)
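
Under the hood, a feed is an XML document that lists recent items. The Python sketch below reads a tiny hand-written RSS 2.0 feed; the feed contents and the example.com addresses are made up for illustration, and a real feed reader would fetch the XML from a website and handle many more fields.

```python
import xml.etree.ElementTree as ET

# A hypothetical feed with two items.
FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item><title>First story</title><link>http://example.com/first</link></item>
    <item><title>Second story</title><link>http://example.com/second</link></item>
  </channel>
</rss>"""


def read_feed(xml_text):
    """Return a (title, link) pair for each item in the feed."""
    channel = ET.fromstring(xml_text).find("channel")
    return [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]


stories = read_feed(FEED)
for title, link in stories:
    print(title, link)
```

A feed reader repeats this step on a schedule and shows the reader whatever items are new since the last check.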

SEO (Search Engine Optimization) — A suite of techniques for improving how a website ranks on search engines such as Google. SEO is often divided into “white hat” techniques, which (to simplify) try to boost ranking by improving the quality of a website, and “black hat” techniques, which try to trick search engines into thinking a page is of higher quality than it actually is. SEO can also refer to individuals and companies that offer to provide search engine optimization for websites.

SEM (Search Engine Marketing) — A type of marketing that involves raising a company or product’s visibility in search engines by paying to have it appear in search results for a given word.

Structured thesaurus — A group of preferred terms created for editorial use to normalize and more effectively classify content. For example, the AP Stylebook is similar to (but includes more rules than) a structured thesaurus in that it gives writers preferred terms to use and standards to follow, so everyone following AP Style writes the word “website” the same way.
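
One way to picture a structured thesaurus is as a lookup table from variant spellings to the preferred term. The Python sketch below assumes a two-entry table built around the “website” example in this entry; it is an illustration, not an excerpt from the AP Stylebook.

```python
import re

# Hypothetical table: each variant maps to the term editors prefer.
PREFERRED_TERMS = {
    "web site": "website",
    "web-site": "website",
}


def normalize(text):
    """Replace every known variant with its preferred term."""
    for variant, preferred in PREFERRED_TERMS.items():
        pattern = re.compile(r"\b" + re.escape(variant) + r"\b", re.IGNORECASE)
        text = pattern.sub(preferred, text)
    return text


normalized = normalize("Visit our Web site or the old web-site.")
print(normalized)  # Visit our website or the old website.
```

Once every story uses the same term, classifying and searching content becomes far more reliable.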

Tag — A common type of metadata used to describe a piece of content that associates it with other content that has the same tag. Tags can be specific terms, people, locations, etc. used in the content it is describing, or more general terms that may not be explicitly stated, such as themes. The term “tag” is also used in the context of markup languages, such as <title> identifying the name of the web page. In HTML, tags usually come in sets of open and closed, with the closed tag containing an extra slash (“/”) inside. For example: <title>This is the Title.</title>.

Taxonomy — A hierarchical classification system. In the world of content, this can be a hierarchy of terms (generally called nodes or entities) that are used to classify the category or subject content belongs to as well as terms that are included in the content. In many cases, website navigation systems appear taxonomical in that users narrow down from broad top-level categories to the granular feature they want to see. An ontology is similar to a taxonomy in that it is also a classification system with nodes or entities, but it is more complex and flexible because ontologies allow for non-hierarchical relationships. While in a taxonomy a node can be either a broader term or narrower term, in an ontology nodes can be related in any way.
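
The contrast between a taxonomy and an ontology can be sketched as two small data structures in Python. The terms and relationship names below are invented for illustration and do not come from any real classification system.

```python
# Taxonomy: every term has at most one broader term (its parent).
PARENT = {
    "News": None,
    "Sports": "News",
    "Politics": "News",
    "Baseball": "Sports",
}


def path_to_root(term):
    """Walk from a term up to the top-level category, then list it top-down."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return list(reversed(path))


# Ontology: terms can be linked by any kind of relationship, not only broader/narrower.
RELATIONS = [
    ("Baseball", "played_in", "Stadium"),
    ("Stadium", "funded_by", "City budget"),
    ("City budget", "covered_under", "Politics"),
]


def related(term):
    """List the relationships that start from a term."""
    return [(relation, other) for subject, relation, other in RELATIONS if subject == term]


breadcrumb = path_to_root("Baseball")
links = related("Stadium")
print(breadcrumb, links)
```

The taxonomy produces the kind of breadcrumb trail used in website navigation, while the ontology can connect a sports term to a politics term without forcing either under the other.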

Tumblr — A free short-form blogging platform that allows users to post images, video, links, quotes and audio. The company is based in New York City and competes with Posterous.

Transparency — In the context of news and information, a term that has become increasingly popular for describing openness about information. In many cases it is used to refer to the transparency of government releasing data to journalists and to the public. It is also often used in the context of journalists being open about their reporting process and material by sharing with their readers before the final product emerges or providing more context in addition to the final product.

URI (Uniform Resource Identifier) — The way to identify the location for something on the Internet. It is most familiarly in “http:” form, but also encompasses “ftp:” or “mailto:”

URL (Uniform Resource Locator) — Often used interchangeably with the “address” of a web page. All URLs are URIs, but not vice versa. While humans are familiar with URLs as a way to see web pages, computer programs often use URLs to pass each other machine-readable content, such as RSS feeds or Twitter information. In addition, words that appear in URLs often help boost search rankings, which is why many content sites are now shifting to URLs with headlines as opposed to data strings.
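
The shift toward URLs built from headlines can be sketched in Python as follows. The headline_slug helper, the example.com address and the sample headline are hypothetical; real sites apply their own rules for building these addresses.

```python
import re
from urllib.parse import urlparse


def headline_slug(headline):
    """Turn a headline into lowercase words joined by hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", headline.lower())
    return slug.strip("-")


url = "http://example.com/news/" + headline_slug("City Council Passes 2012 Budget!")
parts = urlparse(url)  # split the URL into its named pieces

print(url)
print(parts.scheme, parts.netloc, parts.path)
```

The resulting address carries the headline's words in its path, which is the part a search engine can read, while the scheme and host name identify how and where to fetch the page.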

