Chapter 12. Case Studies

Applying the Roadmap

Robert J. Glushko

Table of Contents

12.1. A Multi-generational Photo Collection

12.2. Knowledge Management for a Small Consulting Firm

12.3. Smarter Farming in Japan

12.4. Single-Source Textbook Publishing

12.5. Organizing a Kitchen

12.6. Earth Orbiting Satellites

12.7. CalBug and its Search Interface Redesign

12.8. Weekly Newspaper

12.9. The CODIS DNA Database

12.10. Honolulu Rail Transit

12.11. The Antikythera Mechanism

12.12. Autonomous Cars

12.13. IP Addressing in the Global Internet

12.14. The Art Genome Project

12.15. Making a Documentary Film

12.16. The Dabbawalas of Mumbai

12.17. Managing Information About Data Center Resources

12.18. Neuroscience Lab

12.19. A Nonprofit Book Publisher


We now fulfill the promise of this book, with a set of case study examples that apply the concepts and phases of the roadmap. (The first four case studies appeared in the first print edition of the book. All the others have been contributed by students or other readers of the book and edited for consistency. -Ed.)

Case Study Template

For the sake of consistency, we employ the questions posed in Chapter 2, Design Decisions in Organizing Systems as a template for the case studies. We remind you of six groups of design decisions, itemizing the most important dimensions in each group:

What is being organized? What is the scope and scale of the domain? What is the mixture of physical things, digital things, and information about things in the organizing system? Is the organizing system being designed to create a new resource collection, catalog an existing and closed resource collection, or manage a collection in which resources are continually added or deleted? Are the resources unique, or are they interchangeable members of a category? Do they follow a predictable “life cycle” with a “useful life”? Does the organizing system use the interaction resources created through its use, or are these interaction resources extracted and aggregated for use by another organizing system? (the section called “What Is Being Organized?”)

Why is it being organized? What interactions or services will be supported, and for whom? Are the uses and users known or unknown? Are the users primarily people or computational processes? Does the organizing system need to satisfy personal, social, or institutional goals? (the section called “Why Is It Being Organized?”)

How much is it being organized? What is the extent, granularity, or explicitness of description, classification, or relational structure being imposed? What organizing principles guide the organization? Are all resources organized to the same degree, or is the organization sparse and non-uniform? (the section called “How Much Is It Being Organized?”)

When is it being organized? Is the organization imposed on resources when they are created, when they become part of the collection, when interactions occur with them, just in case, just in time, all the time? Is any of this organizing mandated by law or shaped by industry practices or cultural tradition? (the section called “When Is It Being Organized?”)

How or by whom, or by what computational processes, is it being organized? Is the organization being performed by individuals, by informal groups, by formal groups, by professionals, by automated methods? Are the organizers also the users? Are there rules or roles that govern the organizing activities of different individuals or groups? (the section called “How (or by Whom) Is It Organized?”)

Where is it being organized? Is the resource location constrained by design or by regulation? Are the resources positioned in a static location? Are the resources in transit or in motion? Does their location depend on other parameters, such as time? (the section called “Where is it being Organized?”)

As we discussed in the section called “Where is it being Organized?”, when location is a constraint, it will typically be identified as such in the other questions. As a result, we will only examine “Where?” as a distinct design dimension in cases where it is warranted.

A Multi-generational Photo Collection

Overview. Your grandfather has died, at age 91, and under his bed is a suitcase containing several photo albums with a few hundred photos. Some of them have captions, but many do not. What do you do with them?

Your first thought was to create a digital photo archive of Grandpa’s collection so that you and all your relatives could see them, and you would also want to generate accurate captions where none exist. Since you have an extensive digital photo collection of your own in a web-based application, perhaps you can combine the two collections to create a multi-generational photo organizing system.

This project involves digitization, archiving, social media issues, and negotiations with and collecting information from other family members who might have different views about what to do.

What is being organized? It is easy to find advice about how to digitize old photos, but there are more choices than you might think. What resolution and format should you use? Should you do the work yourself or send Grandpa’s precious photos to a service and take the risk that they might get lost? Should you do any restoration or enhancement of the photos as part of the digitization process?[646]

More fundamental design questions concern the scope and scale of the organizing system. If you are digitizing Grandpa’s photos and combining them with yours, you are skipping a generation. Should you not also include photos from your parents and the rest of Grandpa’s children? That generation has both printed photos and digital ones, but it is not as comfortable with computers as you are, and their digital photos are stored less systematically on a variety of CD-ROMs, DVDs, flash memory sticks, and SD photo cards, making the digitizing and organizing work more complicated. Do these differences in storage media reflect an intentional arrangement that needs to be preserved? And what about that box full of Super 8 cartridges and VHS tapes with family videos on them, and the audio cassettes with recordings made at long-ago family gatherings?

A family history management system that includes many different resource types is a much bigger project than the one you contemplated when you first opened Grandpa’s suitcase. It is easier to consider using separate but related organizing systems for each media type, because there are many web-based applications you could use. In fact, there are far too many choices of web applications for you to consider. You might compare some for their functionality and usability, but given the long expected lifetime of your organizing system there are more critical considerations: whether the site is likely to last as long as your collection and, if it does not, how easily you can export your resources and resource descriptions.[647]

Why is it being organized? The overall goal of preserving Grandpa’s photos needs no justification, but is preservation the primary goal? Or, rather, is it to enable access to the images for far-flung family members? Or is it to create a repository for family photos as they continue to be produced? Alternatively, is it less about the images themselves and perhaps more about collecting family history information contained in the photos, thus making the collection of metadata (accurate information about when and where the photo was taken, who is in it, etc.) most important?

These decisions determine requirements for the interactions that the photo organizing system must support, but the repertoire of interactions is mostly determined by the choice of photo storage and sharing application. Some applications combine photo storage in a cloud-based repository with a very powerful set of digital photography tools, but this functionality comes with complexity that would overwhelm your less technology-savvy relatives. They would be happy just to be able to browse and search for photos.

How much is it being organized? Because you realize that a carefully designed set of categories and a controlled tagging vocabulary will enable precise browsing and search, you chose an application that supports grouping and tagging. But not everyone should be allowed to group or tag photos, and maybe some of the more distant relatives can view photos but not add any.

Will your categories and tags include all of those that Grandpa used when he arranged pictures in albums and made notes on the back of many of them? Do you want to allow annotations? Maybe this is a picture from a vacation; if you go back to the same place, do you want to create an association between the pictures?

Do not forget to keep Grandpa’s original albums in a safe place, not under a bed somewhere.

When is it being organized? Once you create your categories and tags, you can require people to use them when they add new photos to the collection. Perhaps the existing resource descriptions can be completed or enhanced as a collective activity at a family reunion. Do not put this off too long: the people who can identify Grandpa’s sister Gladys, her second husband, and his sister in an uncaptioned photo are getting on in years.
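The idea of a controlled tagging vocabulary combined with different privileges for different relatives can be made concrete with a small sketch. This is only an illustration under stated assumptions: the tag names, the three role names, and the `Photo` and `tag_photo` names are all hypothetical and do not describe any particular photo sharing application.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary; the tag names are illustrative only.
CONTROLLED_TAGS = {"wedding", "vacation", "reunion", "portrait"}

# Hypothetical roles and the actions each one is allowed to perform.
PERMISSIONS = {
    "curator": {"view", "add", "tag"},
    "contributor": {"view", "add"},
    "viewer": {"view"},
}


@dataclass
class Photo:
    filename: str
    caption: str = ""
    tags: set = field(default_factory=set)


def tag_photo(photo, new_tags, role):
    """Add tags only if the role may tag and every tag is in the vocabulary."""
    if "tag" not in PERMISSIONS.get(role, set()):
        raise PermissionError(f"role {role!r} may not tag photos")
    unknown = set(new_tags) - CONTROLLED_TAGS
    if unknown:
        raise ValueError(f"tags not in controlled vocabulary: {sorted(unknown)}")
    photo.tags |= set(new_tags)
    return photo
```

Rejecting unknown tags at the moment a photo is tagged is one way to organize "on the way in"; a more permissive design might instead accept free-form tags and let the curator reconcile them later.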

How or by whom is it being organized? You have taken on the role of the editor and curator, but you cannot do everything and having a group of people involved will probably result in more robust organizing. A group can also better handle sticky situations like what to do if people get divorced or have a falling out with other family members; do pictures taken of or by them get deleted?

Other considerations. Maintenance of this collection for an indefinite time raises the important issue of a succession plan for the curator. If only one name is on the account and only that person knows the password, you run the risk of losing access to the photos if that person dies. One of Grandpa’s mistakes was dying without clearly specifying his intentions for his photo collection, so whatever you decide you should document carefully and include a continuity plan for when you are no longer the curator.[648]

Knowledge Management for a Small Consulting Firm

Overview. A senior professor who has done part-time consulting for many years is very pleased when his latest book becomes a best-seller and he is inundated with new consulting opportunities. He decides to take a two-year leave of absence from his university to start a small consulting firm with several of his current and former graduate students as his junior consulting partners.

An organizing system for knowledge management is required, but what gets designed will depend on the scoping decision. Is the goal of the system to support the consulting business, or also to support ongoing and future research projects that sooner or later will generate the consulting opportunities?

What is being organized? The professor concludes that since his consulting is based on his research, he needs to include in the new knowledge management system his research articles and the raw and analyzed data that is discussed in the articles. These resources are already organized to a great extent according to the research project that led to their creation. These have been kept in the professor’s university office.

The professor also has a separate collection of consulting proposals, client reports, and presentations that he has made at client firms. Because of restrictive university rules about faculty consulting, the professor has always kept these resources in his home office rather than on campus.[649]

In addition to these existing resource types, it will be necessary to create new ones that make systematic and explicit information that the professor has managed in an informal and largely tacit manner. This includes consulting inquiries, information about prospects, and information about specific people in client firms.

Why is it being organized? The professor has usually just done one consulting project at a time, very opportunistically. He has often turned down projects that involved more work than he could do himself. He now sees the opportunity to do much more consulting and to take on more significant projects if he can leverage his expertise in a more efficient way.

The professor can take on the “rainmaker” role to secure new consulting engagements and make the important decisions, and he is confident that he can train and support his new staff of current and former students to do much of the actual consulting work.

The knowledge management system must enable everyone in the firm to access and contribute to project repositories that contain proposals, plans, work in progress, and project deliverables. Much of this work can be reused from one project to another, increasing the productivity of the firm and the quality of its deliverables.[650]

Just as it is essential that the professor’s knowledge is systematized and made available via a knowledge management system, so must the knowledge created by the new staff of consultants. The professor cannot expect that all of the students will work for him forever, so any knowledge that they acquire and create in the course of their work will be lost to the firm unless it is captured along with the professor’s.

The consulting firm probably will not have an indefinite lifetime. After his leave of absence, the professor might return to his university duties, perhaps on a part-time basis. The knowledge management system will enable him to leave the firm in someone else’s hands while enabling him to keep tabs on and possibly contribute to ongoing projects. Alternatively, if the firm is doing very well, perhaps the professor will resign his university position and take on the role of growing the firm. A larger consultancy might want to acquire the professor’s firm, and the firm’s valuation will in part be determined by the extent to which the firm’s capabilities and resources are documented in the knowledge management system.

How much is it being organized? A small firm has neither the money nor the people to invest in complex technology and a rigorous process for knowledge management, but appropriate technology is readily available and affordable. Decisions about organizing principles must be made that reflect the mix of consulting projects; resources might be organized in a shared file system by customer type, project type, the lead consultant, or all of these ways using a faceted classification approach.
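A faceted approach of the kind mentioned above can be sketched in a few lines. The three facets come from the text (customer type, project type, lead consultant); the project records, their values, and the function names `build_facet_index` and `select` are invented for illustration and are not a description of any real knowledge management product.

```python
from collections import defaultdict

FACETS = ("customer_type", "project_type", "lead")

# Hypothetical project resources; every name and value here is made up.
projects = [
    {"name": "proposal-001", "customer_type": "manufacturer",
     "project_type": "strategy", "lead": "professor"},
    {"name": "report-002", "customer_type": "manufacturer",
     "project_type": "training", "lead": "consultant-a"},
    {"name": "proposal-003", "customer_type": "nonprofit",
     "project_type": "strategy", "lead": "consultant-a"},
]


def build_facet_index(resources, facets):
    """Map each facet to {facet value: set of resource names}."""
    index = {facet: defaultdict(set) for facet in facets}
    for resource in resources:
        for facet in facets:
            index[facet][resource[facet]].add(resource["name"])
    return index


def select(index, all_names, **criteria):
    """Return the names that match every given facet=value criterion."""
    result = set(all_names)
    for facet, value in criteria.items():
        result &= index[facet].get(value, set())
    return result
```

Because each resource is described independently on every facet, the same collection can be browsed by customer, by project type, or by consultant without committing to a single folder hierarchy.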

Standard document templates and style sheets for the resource types created by consultants can be integrated into word processors and spreadsheets. Contact and customer management functionality can be licensed as a hosted application.

Many small teams make good use of wikis for knowledge management because they are very flexible in the amount of structure they impose.[651]

When is it being organized? The professor’s decision to take a leave of absence reflects his belief that getting the firm started quickly is essential if he is to capitalize on his recent bestselling book to generate consulting business. This makes managing the prospect pipeline and the proposal-writing process the highest priority targets for knowledge management.

Much of the other organizing work can emerge as adjuncts to consulting projects if some effort is made to coordinate the organizing across projects.

How or by whom is it being organized? Because many of the early organizing decisions have implications for the types of customers and projects that the firm can take on, only the professor is capable of making most of them. The principal goal of the knowledge management system is to enable the professor to delegate work to his consulting staff, so he needs to enlist them in the design of the organizing system to ensure it is effective.

Other considerations. As the consulting firm grows, it is inevitable that some consultants will be better than others at creating and using knowledge to create customer value, and they will expect to be compensated accordingly. It is essential for the ongoing success of the firm not to let this create disincentives for knowledge capture and sharing between consultants. The solution is to develop a company culture that promotes and rewards knowledge capture and sharing.[652]

Smarter Farming in Japan

Overview. Unlike the first two case studies, this is an actual case rather than a hypothetical or composite one. It shares with the first two cases a focus on preserving valuable resources but in the radically different domain of farming.

This case concerns an initiative by Fujitsu, a Japanese technology firm, to apply “smart computing” and lean manufacturing techniques to the agricultural sector, which lags in technology use. Fujitsu is testing a “farm work management system” at six Japanese farms. In this case study we will focus on the farm highlighted in a 2011 Wall Street Journal story.[653]

This test farm is located in southern Japan. It has 60 different crops spread over 100 hectares (about 250 acres), an area slightly larger than the central campus of the University of California at Berkeley.

What is being organized? Sensors are deployed in each of 300 different farm plots to collect readings on temperature, soil, and moisture levels. Video cameras also monitor each plot.

The 72 relatively unskilled workers on the farm are also managed resources. Each of them carries a mobile phone for communication, transmission of pictures, and GPS tracking of their location.

Why is it being organized? The highest-level goal for Fujitsu is to expand its reach as a technology firm by applying the concepts of lean manufacturing, statistical process control, and continual improvement to new domains. Farming is an obvious choice in Japan because it is a relatively unproductive sector where the average age of farmers is over sixty. It is essential that farms use more computing capability to increase efficiency and to capture and reuse the scarce knowledge possessed by aging workers.

The Fujitsu farm work management system supports numerous types of interactions to achieve these goals. For example, workers can send pictures of infected crops for diagnosis by an expert farmer in the farm’s office, who can then investigate further by studying recorded video from the affected plot.

As more farms deploy the Fujitsu system, the aggregated knowledge and sensor information can be analyzed to enable economies of scale that will allow separate and widely distributed farms to function as if they were all part of a single large firm with centralized management.[654]

How much is it being organized? The current design of the system treats farm workers as relatively passive resources that are managed very closely. The system generates a daily schedule of planting, maintenance, harvesting, and other activities for each worker. At a daily wrap-up meeting the farm manager reviews each worker’s performance based on GPS and sensor readings.

The sensor data is analyzed and organized extensively by Fujitsu computers to make recommendations, both agricultural ones (e.g., what crop grows best in each plot and the work schedule that optimizes quality and yield) and business ones (the profitability of growing this crop on this plot of land).

When is it being organized? The farm work management system is continually organizing and reorganizing what it knows about the farm as it analyzes sensor and production information. In contrast, the information created by the workers is captured but its analysis is deferred to an expert.

It is conceivable that, as the farm workers become more expert as a result of the guidance and instruction they receive from the system, they can be more autonomous and do more analysis and interpretation on their own. It is also likely that the inexorable forces of Moore’s law will enable more data collection and more processing of the sensor data at its time of collection, which might result in increased real-time information exchange with the workers.

How or by whom is it being organized? The physical organization of the farm, with 300 small plots of land with 60 different fruits and vegetables, is the legacy arrangement of the farm before the Fujitsu trial began. Because of the sizable investment that Fujitsu has made in the farm to deploy the system, it is likely that the farm manager defers to recommendations made by the system to change crop arrangements. So it is reasonable to conclude that most of the decisions about the organizing system are made by computational processes rather than by people.

Other considerations. Fujitsu built this system for managing farms, but there are several other resource domains with similar challenges about capturing and reusing operational knowledge: vineyards, forests, and fish farms come to mind.[655] It will be interesting to see if the farm work management system can be made more abstract and configurable so that the same system can be used in all of these domains.

Farm crops, vineyards, trees, and fish pens do not move around, so a more challenging application of sensor technologies arises with cattle herd management. Nevertheless, sensors inserted in the genitals of a female dairy cow can trigger a text message to a herd manager’s cell phone when the cow is in heat, preventing the economic loss of missing a reproductive cycle.[656]

Somewhat more remote domains for potential application of systems that combine sensor networks with workforce management include sales, field support, and logistics.

Single-Source Textbook Publishing

Overview. The fourth case is also an actual case, and a self-referential one. It is a case study about the organizing system involved in the creation, production, and distribution of The Discipline of Organizing. See (Glushko 2015).

We have known since the beginning of this project that this book should not just be a conventional text. A printed book is an intellectual snapshot that is already dated in many respects the day it is published. In addition, the pedagogical goal of The Discipline of Organizing as a textbook for information schools and similar programs is made more difficult by the relentless growth of computing capability and the resulting technology innovation in our information-intensive economy and culture. We think that the emergence of ebook publishing opens up innovative possibilities as long as we can use a single set of source files to produce and update the print and digital versions of this book.

What is being organized? The content of this book began in early 2010 as more than 1000 slides and associated instructor notes for a graduate course “Information Organizing and Retrieval” that Robert J. Glushko, the primary author and editor of The Discipline of Organizing, was teaching at the University of California, Berkeley. These slides and notes were created in XML and transformed to HTML for presentation in a web browser.[657]

The first decision to be made about resource organization led to the iterative sorting of the slides from 26 lectures into the 10 chapters in the initial outline for the book. The second decision concerned the granularity of the new content resources being created for the book. The team of authors was organized by chapters, which made chapters the natural granularity for file management and version control. Because authors were widely dispersed we relied on the Dropbox cloud storage service to synchronize work. Nevertheless, the broad and deep topical coverage of the book meant that chapters had substantial internal structure (four levels of headings in some places), and many of these subsections became separately identified resources that moved from chapter to chapter until they found their natural home.

In addition to the text content and illustrations that make up the printed text, we needed to organize short videos, interactive examples, and other applications to incorporate in digital versions of the book.

Finally, it has been essential to view the software that transforms, assembles, formats, and assigns styles when turning source files into deliverable artifacts as resources that must be managed. For the first and second editions of the book, we were fortunate to get much of the software required to build both print and ebooks from O’Reilly and Associates, an innovative technology publisher that has been developing a single-source publishing system called Atlas. Because we have recently been experimenting with including richer interactivity and navigation capability, reader-controlled personalization, and other features that go beyond what Atlas enables, we now use our own custom-built single-source publishing system.

Why is it being organized? Publishing print and ebook versions of a text from the same source files is the only way to produce both in a cost-effective and maintainable fashion. Approaches that require any “hand-crafting” would make it impossible to revise the book on a timely schedule. Furthermore, a survey of Berkeley students in the summer of 2012 revealed a great diversity of preferred platforms for reading digital books that included laptop computers, Apple and Android tablets, and seven different dedicated ebook readers. Only an automated single-source publishing strategy could produce all these outputs.

The highly granular structure for the content resources that comprise this book makes cross-referencing vastly more precise, making it easier to use the book as a textbook and job aid. It will also make it easier to maintain and adapt the text for use in online courses. (The emerging best practice for online courses is to break up lectures and study content into smaller units than used in traditional classroom lectures.)

How much is it being organized? The nature and extent of resource organization for this book reflects its purpose of bringing together multiple disciplines that recognize organizing as a fundamental issue but from different perspectives. The book contains many specialized topics and domain-specific examples that might overwhelm the shared concepts. Our solution was to write a lean core text and to move much of the disciplinary and domain-specific content into tagged endnotes. These categories of endnotes are somewhat arbitrary, but the authoring task of identifying content to go into endnotes is a non-trivial one.

The extent of resource organization is also affected by the choice of XML vocabulary, and we carefully considered whether to choose DITA or DocBook. DITA has the benefit of having more native support for modular authoring and transparent customization and updating, but DocBook is much older and hence has better toolkits. We eventually chose DocBook.[658]

When is it being organized? Despite the fact that the lecture notes with which the book began were in XML, we decided to author the book using Microsoft Word. Many of the authors had little experience with XML editors, and the highly developed commenting and revision management facilities in Word proved very useful. This tradeoff imposed the burden of converting files to XML during the production process, but only two of the authors were still working on the book at that stage, and both have decades of experience with hypertext markup languages.

How or by whom is it being organized? The chapter authors used Word style sheets in a careful manner, tagging text with styles rather than using formatting overrides. This enabled a conversion vendor to convert most of the book from Word to XML semi-automatically. Some cleanup of the markup is inevitable because of the ambiguity created when the source markup with Word styles is less granular than the target markup in XML. We do not know whether the amount of work left for us was atypical.

Nevertheless, waiting until the book was substantially finished to convert to XML meant that we were also deferring the effort to mark up the text with cross references, citations, glossary terms, and index entries, because these types of content were not included in the Word authoring templates and style sheets. As a result, a substantial amount of effort has been required of our copy and markup editor that could have been done by chapter editors if they had authored natively in XML. However, having a single markup editor has given this book a more consistent and complete bibliography, glossary, and index than would have been possible with multiple authors.

Other considerations. Because every bit of content in the book is tagged as either “core” or discipline-specific, our source files collectively represent a “family of books” with 2048 different members, any one of which we can build by filtering the content to include any combination from zero to eleven disciplines. It is impractical to publish this many editions, but we hope to use this flexibility to enable instructors to tailor the text for a wide range of courses in many different academic disciplines and customize the text for both graduate and undergraduate students. Better still would be an approach that defers the generation of a particular version of an ebook from “publishing time” to “reading time.” The same algorithms apply, but now the reader decides when and how to apply them, enabling the dynamic configuration of the book’s content. This radical capability is experimental as of August 2015, but we expect it to be generally available before too long.
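The figure of 2048 follows from the eleven discipline tags: each discipline is either included or left out, so there are 2^11 = 2048 possible selections. The sketch below checks that count and shows the filtering idea in miniature. It is not the book's actual build software; the placeholder names `discipline_1` through `discipline_11`, the sample content blocks, and the functions `count_editions` and `build_edition` are assumptions made purely for illustration.

```python
from itertools import combinations

# Eleven placeholder discipline tags; the real tag names are not given here.
DISCIPLINES = [f"discipline_{i}" for i in range(1, 12)]


def count_editions(disciplines):
    """Count every selection of zero up to all of the disciplines."""
    return sum(
        1
        for k in range(len(disciplines) + 1)
        for _ in combinations(disciplines, k)
    )


def build_edition(blocks, selected):
    """Keep core content plus content tagged with a selected discipline."""
    selected = set(selected)
    return [text for tag, text in blocks if tag == "core" or tag in selected]


# Hypothetical tagged content: (tag, text) pairs in document order.
blocks = [
    ("core", "Core paragraph."),
    ("discipline_1", "Endnote for discipline 1."),
    ("discipline_2", "Endnote for discipline 2."),
]
```

Under these assumptions, `build_edition(blocks, ["discipline_2"])` keeps the core paragraph and the second endnote only. The same filter could in principle run at "publishing time" to produce a fixed edition or at "reading time" in response to a reader's choices.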

This design for a book challenges conventional definitions of book editions and forces us to imagine new ways to acknowledge collaborative authorship. But asking “What is The Discipline of Organizing?,” given these new authoring and publishing models, is a similar question to the one asked in Chapter 4, Resources in Organizing Systems, “What is Macbeth?”

Organizing a Kitchen

By Emilie Hardman, April 2013.

Overview. Just about everyone has a kitchen in their home or apartment, and most kitchens contain many of the same resources. These include pots and pans, dishes, bowls, drinking glasses, silverware, and cooking tools of various kinds. Kitchens are also often the location for organizing food items, cooking ingredients, spices, wine, and other beverages. Kitchens also invariably contain refrigerators and freezers for storing prepared and preserved food.

The organizing system for a kitchen is highly influenced by the size, shape, and arrangement of the counters, cabinets, shelves, and other parts of the physical environment of the kitchen. A person building a new home might be able to design this kitchen environment, but most people treat this as a given and work within its affordances, often because there are limits to how much the physical environment can be easily changed.

Kitchen Organizing System


My kitchen. I did my annual deep kitchen clean and it deserved a picture.

(Photo by Emilie Hardman. Creative Commons CC-BY-SA-2.0 license.)

What is being organized? Our wine, wine glasses, cocktail glasses and ingredients, as well as tea and coffee stuff were stored in the cabinet by the fridge, close to the center worktable so people could have easy access to them. Because of space limitations, this meant that our water glasses had to be somewhere else, but as we would usually put out water for dinner parties or have a pitcher and glasses on a tray when people came over, we thought this was reasonable, since the things people would most often be looking for and need easy access to for themselves would be these more social drinking glasses.

We also bought a freestanding worktable with a butcher’s block and stainless steel for pastry and chocolate work, as well as extra counter space in general. It worked as a prep space and as an area to lay out finished dishes or drinks for people to serve themselves when we had parties.

Some kitchen tools were kept with the food items to which they applied: for example, the coffee and the coffee grinder, or the cutting board, toaster, bread knife and bread all together. Other tools were kept with like tools: potato peelers, julienne tools, knives, etc. This was probably because of the kind of flexibility something like a potato peeler would have versus a coffee grinder; it also made more sense to put lots of these little things together in a drawer rather than leave them strewn out around the apples or potatoes.

Pots and pans had their own spaces and were stacked within one another; same with dishes. Most frequently used things were given preference over specialty tools.

Other things that were organized around the social dimensions of the kitchen were some food items and serving elements. For example, we used bowls to organize chocolate bars and treats that might easily be grabbed to set out and serve. Similarly, we kept stacks of serving bowls easily at hand so we could empty pretzels or tortilla chips, olives, etc., quickly and casually.

Why is it being organized? We wanted to emphasize a feeling of comfort and openness in our kitchen, so people would feel free to get what they wanted when they needed it. It also had to work on a practical level to be an efficient work area in a small space, so those concerns had to be balanced as well.

When is it being organized? We ended up moving silverware at one point because friends would consistently open a particular drawer in our center work island to look for silverware. Initially, I had specialized tools in that drawer because they were what I would reach for when I was working on something like making chocolates, but because of the continuous confusion, we moved those tools to another drawer and put the silverware where people seemed to expect it.

The fridge and freezer were organized by type of food for orderliness, ease of access, and immediacy of knowing what we had. We have a pull-out freezer, so things could get a little hidden, but assuming no one had compromised the system, you would know it was frozen fruit all of the way down in one segment and flours in another.

Some food items demanded different placement or storage based on their ripeness, the season, etc. In August we might be overrun with tomatoes, for example, and the window sills would fill up with them, whereas we would usually put them in a bowl if there were just a few.

How or by whom is it being organized? One way to sum this up is to say that my partner is a librarian and I am trained as an archivist. We both care about classification and public service, and we also entertain a lot, so building very practical and intuitive systems for grouping things is a natural motivation.

My father, an engineer who in his retirement does a lot of woodworking, built two cabinets that would just fit into the space and provide more storage than the two upper cabinets and three base cabinets provided in the kitchen.

Other considerations. The whole kitchen was not organized around guests, though. We also arranged things to be practical for cooking and for space saving. Food in the cabinets was organized by general function: for example, there was a shelf of dried beans in jars, another of dried chilies and spices—things that give flavor. Spices were organized within that by general type in rows and then alphabetically within those rows. This was because the rows helped group things which might be likely used together (e.g., cinnamon, cloves, mace, nutmeg) and alphabetically because so many of them look the same from the outside; knowing that the oregano would necessarily be shelved before the thyme was useful. Beans, though, because they are more immediately identifiable, less used, and certainly not as often used in concert (as one would with spices), I was a little more loose with and sometimes just arranged to a general aesthetic preference; if we had heirloom money beans, I might have preferred to see them over the standard red lentils, for example.
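The two-level spice arrangement described above (rows by general type, then alphabetical within each row) amounts to sorting on a compound key. A minimal sketch follows; the row labels are hypothetical and stand in for whatever groupings the shelf actually used.

```python
# Hypothetical shelf contents: (row label, spice name).
# The row labels are illustrative assumptions, not the author's categories.
spices = [
    ("herbs", "thyme"),
    ("baking", "nutmeg"),
    ("herbs", "oregano"),
    ("baking", "cinnamon"),
    ("baking", "mace"),
    ("baking", "cloves"),
]

def shelve(items):
    """Order spices by row (general type), then alphabetically within the row."""
    return sorted(items, key=lambda item: (item[0], item[1].lower()))

for row, name in shelve(spices):
    print(f"{row:8} {name}")
```

The first key keeps spices that are likely to be used together in the same row; the second makes a jar findable by name when the jars look alike.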

Earth Orbiting Satellites

By Daniel Brenners, December 2014.

Overview. Twenty-two thousand miles above our heads, a global race for orbital real estate is underway. A single circular orbit around the Earth, called the geostationary Earth orbit (GEO), is the only area in space that allows a satellite to remain at a fixed point in the sky above Earth’s surface while the planet rotates.1 This prime location allows satellites to have consistent communication with the ground below. Satellite television, a $100 billion industry, relies on satellites within the GEO to broadcast signals to homes across the world. Augmentation services for global positioning systems (GPS) and military applications also depend on satellites within this thin ring around the Earth. Unfortunately, space is severely limited in the GEO, and tension is growing over who gets to send their satellites to this valuable parking lot in the sky. The principles used to organize which satellites get to be placed in the GEO have many unforeseen legal and sociopolitical complications. As room becomes limited, it becomes increasingly important to find a solution to the problem of multiple organizing agents competing to organize this system to support varying interactions.
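The altitude quoted above can be checked with Kepler's third law: a circular orbit stays over one spot on the equator only if its period equals one rotation of the Earth. The sketch below uses standard published values for the physical constants; the function name is ours.

```python
import math

MU_EARTH = 3.986004418e14   # Earth's gravitational parameter GM, m^3/s^2
SIDEREAL_DAY = 86164.0905   # one rotation of Earth relative to the stars, s
EARTH_RADIUS = 6378.137e3   # equatorial radius, m

def geo_altitude_m():
    """Altitude of a circular orbit whose period equals one sidereal day."""
    # Kepler's third law for a circular orbit: T^2 = 4 * pi^2 * r^3 / GM
    r = (MU_EARTH * SIDEREAL_DAY**2 / (4 * math.pi**2)) ** (1 / 3)
    return r - EARTH_RADIUS

altitude_km = geo_altitude_m() / 1000
altitude_miles = altitude_km / 1.609344
print(f"{altitude_km:,.0f} km, about {altitude_miles:,.0f} miles")
```

Because the period fixes the radius, there is exactly one such altitude, which is why the GEO is a single ring rather than a region.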

What is being organized? The resources being organized are the satellites being deployed to the GEO. These are physical objects that have been launched into orbit. The satellites are each unique and are able to provide a variety of interactions. The only unifying attribute that they share is that they are computers that are able to send and receive radio signals to and from Earth. To stay in orbit, they are also able to adjust their position with propulsion systems.

This organizing system is designed to manage a collection in which resources are continually added and removed. The International Telecommunication Union (ITU) records which portions of the orbit are already occupied.2 Satellites cannot stay in the orbit forever, as they expend lots of energy performing computational processes and maintaining orbit, and eventually run out of power. The resources follow a lifecycle that is unique to each individual resource, but the timescale is typically one to fifteen years.3

Why is it being organized? Satellites are being organized in the GEO to support several interactions. The GEO allows satellites to move at the same rate as the Earth, giving each satellite a stationary view of more than 40 percent of the Earth’s surface. Such a view is ideal for broadcasting signals to large regions and performing remote sensing, such as weather forecasting. They also serve as crucial relay points to transfer telecommunications across the globe. Other interactions that these satellites provide include surveillance, scientific research, global positioning, navigation, and military reconnaissance.3 Longitudinal positioning along the GEO shapes which interactions can occur and which users can interact with the satellite. For instance, a satellite directly over the Atlantic Ocean may not be well suited to broadcast a television signal, but may be positioned to relay signals from North America to Europe.
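The "more than 40 percent" figure follows from simple geometry. The part of a sphere inside the horizon of an observer at distance r from its center is a spherical cap, and the cap's share of the whole surface is (1 - R/r)/2. A rough sketch, assuming a perfectly spherical Earth and a line of sight all the way to the horizon:

```python
EARTH_RADIUS_KM = 6378.137   # equatorial radius
GEO_RADIUS_KM = 42164.0      # approximate distance of the GEO from Earth's center

def visible_fraction(orbit_radius_km, body_radius_km=EARTH_RADIUS_KM):
    """Fraction of a sphere's surface inside the horizon seen from orbit_radius_km."""
    # The visible region is a spherical cap with half-angle theta,
    # where cos(theta) = R / r. Cap area / sphere area = (1 - cos(theta)) / 2.
    return (1 - body_radius_km / orbit_radius_km) / 2

print(f"{visible_fraction(GEO_RADIUS_KM):.1%}")
```

In practice the usable footprint is smaller, since signals near the edge of the cap arrive at very low elevation angles.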

The users are practically everyone on Earth. Civilians use geostationary satellites when their GPS receivers apply satellite-broadcast corrections or when they need to have a call relayed to distant regions of the world. Commercial organizations, such as television providers, use these satellites to broadcast signals down to viewers. Geostationary satellites are also particularly useful for early warning systems used by the military to detect ballistic events around the globe.

How much is it being organized? If resources are able to be placed in the GEO, they are placed in a vacant slot that the applicant chooses, based on what types of interactions they want to support and what users they want interacting with the satellite. To prevent signal interference and collision, satellites need to be placed very far apart, leaving only 2,000 total orbital slots where satellites can be placed in the GEO.4 The ITU uses a first-come, first-served organizing principle to decide which resources are placed into orbital slots, provided the applicant completes the lengthy application process.

The organization applying for the slot chooses where to place its satellite. The ITU catalogs these slots as degrees longitude, and includes other resource descriptions such as the name of the satellite, country of operator, types of users, mass, expected lifetime, and contractor.3 Organizations choose to place satellites around the longitude of the Earth that the satellite is supposed to interact with. Since the latitude is fixed at zero degrees, countries with the same longitude but different latitudes (countries directly north or south of each other) must vie for the same slots.

When is it being organized? Satellites are added as soon as they can be approved by the ITU and launched into orbit. At the end of their life cycle, the Federal Communications Commission mandates that U.S. satellites be pushed into what is called the graveyard orbit, which is a few hundred kilometers outside of the GEO.5 At this point, another satellite can be added to the vacant slot via the ITU application process.
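The first-come, first-served principle and the slot lifecycle can be sketched as a small registry keyed by longitude. This is a toy model, not the ITU's actual procedure: it assumes the 2,000 slots quoted above are of equal width, and the class and method names are hypothetical.

```python
from dataclasses import dataclass, field

TOTAL_SLOTS = 2000  # figure quoted in the text

@dataclass
class SlotRegistry:
    """Toy first-come, first-served registry of GEO slots keyed by longitude."""
    occupied: dict = field(default_factory=dict)  # slot index -> satellite name

    @staticmethod
    def slot_index(longitude_deg):
        # Normalize to [0, 360) so that -75 and 285 degrees map to the same slot.
        return int((longitude_deg % 360) / 360 * TOTAL_SLOTS) % TOTAL_SLOTS

    def apply(self, satellite, longitude_deg):
        """Grant the slot only if no earlier applicant holds it."""
        index = self.slot_index(longitude_deg)
        if index in self.occupied:
            return False
        self.occupied[index] = satellite
        return True

    def retire(self, longitude_deg):
        """Free a slot, e.g. when a satellite moves to the graveyard orbit."""
        return self.occupied.pop(self.slot_index(longitude_deg), None)

registry = SlotRegistry()
print(registry.apply("sat-A", -75.0))   # True: the slot was vacant
print(registry.apply("sat-B", 285.0))   # False: same slot, later applicant
registry.retire(-75.0)
print(registry.apply("sat-B", 285.0))   # True: the slot has been vacated
```

Note that nothing in the registry weighs the merits of competing applicants; arrival order alone decides, which is the property the case study goes on to question.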

How or by whom is it being organized? Many organizing agents are competing with each other to organize this system according to their own needs. Applications to occupy the GEO come from countries, scientific organizations, companies, and civilians. Satellite TV companies such as DirecTV, Dish Network, and Intelsat own a large number of the slots across the western hemisphere. Countries such as the United States, Russia, and the United Kingdom own a majority of the military satellites, and multinational European organizations own a large share of orbital slots as well.3

Other considerations. Although the ITU serves as an authoritative entity for this organizing system, the reality is that the ambiguous legality of ownership in outer space means that anyone can attempt to organize this system. The ITU is in place to perform the useful task of cataloging occupied slots and facilitating the filling of vacancies, but it has no way of enforcing these guidelines.

This organizing system is interesting because many agents are attempting to organize the same system. There are also interesting social implications that stem from the system’s principles of organization. The first-come, first-served system of the ITU has the effect of allowing only technologically advanced organizations to manage the collection. It does not take into consideration that by the time many countries are finally ready to use this type of technology, there will be no more room in the GEO belt.

Ironically, the only legal claim to sovereignty that has been made over this organizing system has come from countries that, generally, have no means of organizing it themselves. In 1976 eight equatorial countries, which lie directly below the GEO belt, stated that they had exclusive rights over these slots in a document known as the Bogotá Declaration.6 The tenuous claim was that the orbit is not a part of outer space, because its existence is solely dependent on Earth’s gravity, and that the earth within the borders of the equatorial countries creates GEO with its gravitational pull. Many experts disagree, arguing that the gravitational pull from the moon and other celestial bodies defines the GEO and that the orbit does indeed lie in outer space because it is farther than 100 kilometers from Earth. This demarcation, known as the Kármán line, is a widely accepted definition of where space begins.7 This would then make the GEO fall within the 1967 Outer Space Treaty, effectively leaving no possibility for ownership of the orbit.

Finding a dividing line between space and Earth’s atmosphere is an interesting topic, especially considering that ownership of valuable resources may be decided based on what is included in the category of space versus the category of atmosphere. In this case, the Kármán line roughly represents the altitude at which an aircraft would have to travel faster than orbital velocity to generate enough lift to keep itself up. While this is not intuitive (hardly carving nature at its joints), it does serve as a useful demarcation that is not completely arbitrary. It can be seen as a goal-based category, where the goal is using traditional means of traveling through the air using aeronautics. It makes sense that this is the line the Fédération Aéronautique Internationale uses to divide astronautics and aeronautics.

The limited availability of spots in the GEO, along with the relatively small number of countries able to launch satellites, has the potential to further divide countries. By the time most countries will be able to launch satellites, there will likely not be any room left. Although there are only around 400 satellites currently in geostationary orbit, there are already more filings for ITU applications than there are spots available.4 Only a select few countries will be able to take advantage of the GEO, leaving others to depend on these countries for communication, scientific research, and surveillance. Furthermore, this could limit the interactions of these less developed countries to those interactions dictated by the countries with geostationary satellites. In particular, these developed countries can greatly influence the information that citizens in other countries can receive via satellite.

But even within the technologically advanced countries, competition for orbital slots may be heating up. In early 2014, the U.S. unveiled its Geosynchronous Space Situational Awareness Program (GSSAP), which aims to create maneuverable satellites that monitor and protect the precious GEO belt.8 This reveal comes only months after China was seen practicing its anti-satellite missile capabilities.9 In Russia, $300 million is being spent to construct a craft that would act as a “space broom” to push satellites out of geostationary orbit. The U.S. has a similar program, named the Phoenix project under DARPA, developing a robotic device that can help maintain satellites and possibly dismantle others without causing excess space debris.

Although this might simply be countries attempting to flex their military muscles, these technologies represent a newfound ability for countries to organize resources in the GEO to fit their own agenda. Years ago, the countries that were able to get satellites into orbit were the ones that could reap the benefits. Now, it seems that we may be entering an age where a country’s ability to make room for itself, possibly by force, will determine if it can make use of precious interactions created by these limited resources.

Notes: The following notes relate to this case study.

1. NASA Jet Propulsion Laboratory Basics of Space Flight Section 1 Chapter 5: Planetary Orbits http://www2.jpl.nasa.gov/basics/bsf5-1.php

2. ITU Space Services Department (SSD) 2014 http://www.itu.int/ITU-R/go/space/en

3. Union of Concerned Scientists Satellite Database http://www.ucsusa.org/nuclear_weapons_and_global_security/solutions/space-weapons/ucs-satellite-database.html#.VJKNXmTF-5I

4. Posen M., Have We Got a Slot? RPC Telecommunications Ltd. World Space Forum Dubai March 2010 http://www.rpctelecom.com/files/Have We Got A Slot.pdf

5. De Selding P., FCC Enters Orbital Debris Debate. Space News, 28 Jun. 2004

6. Finch M., Limited Space: Allocating the Geostationary Orbit. Northwestern Journal of International Law Vol 7 Issue 4 Fall 1986

7. Haraszti G., Questions of International Law Volume 2. Akademiai Kiado Budapest 1981

8. Hsu J., Global Conflict Could Threaten Geostationary Satellites: China, Russia and the U.S. have the ability to destroy one another’s eyes in the sky. Scientific American March 31, 2014 http://www.scientificamerican.com/article/global-conflict-could-threaten-geostationary-satellites/

9. Shalal-Esa A. U.S. sees China launch as test of anti-satellite muscle. Reuters May 2013 http://www.reuters.com/article/2013/05/15/us-china-launch-idUSBRE94E07D20130515

CalBug and its Search Interface Redesign

By Gracen Brilmyer, December 2014.

Overview. The CalBug project, housed at the Essig Museum of Entomology at the University of California, Berkeley, is a collaborative initiative among nine California institutions with a goal of digitizing over a million specimens. Digitization involves imaging both specimens and their labels as well as storing their collection information in a database. The CalBug project is also attempting to georeference these million specimens (some dating back to the 18th century), that is, to locate the latitude and longitude coordinates of where each was originally collected, so that they can be better used for research. The project uses many student workers, graduate students, and volunteers to capture the images and data. Over the past few years, it has participated in the Notes from Nature project, which helps connect citizen scientists to scientific research. Working from the images of specimen labels generated by the team at the Essig Museum, citizen scientists digitally transcribe the data that can be read from each image. After each label is transcribed by 24 citizen scientists, the Essig runs an R program to find the most accurate transcription and transfer it into the Essig’s database. These combined efforts have culminated in over 209,000 specimen records and over 400,000 images and counting. This project has a large scope and an ever-increasing scale.
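The text does not say how the Essig's R program decides which transcription is most accurate. One common approach to reconciling several independent transcriptions is a per-field plurality vote, sketched below in Python rather than R. The field names and sample values are hypothetical, and the real program may use a different method.

```python
from collections import Counter

def normalize(value):
    """Collapse whitespace and case so trivially different transcriptions agree."""
    return " ".join(value.split()).casefold()

def consensus(transcriptions):
    """Pick, for each field, the value most transcribers agree on.

    `transcriptions` is a list of dicts mapping field name -> transcribed string.
    Returns a dict of field -> (winning value, share of transcribers who agreed).
    """
    result = {}
    fields = {name for record in transcriptions for name in record}
    for name in sorted(fields):
        values = [record[name] for record in transcriptions if record.get(name)]
        if not values:
            continue
        counts = Counter(normalize(v) for v in values)
        winner, votes = counts.most_common(1)[0]
        # Report the first original spelling that normalizes to the winner.
        original = next(v for v in values if normalize(v) == winner)
        result[name] = (original, votes / len(values))
    return result

# Hypothetical transcriptions of one label by three volunteers.
labels = [
    {"locality": "Berkeley, Alameda Co.", "date": "1932-05-14"},
    {"locality": "berkeley,  Alameda Co.", "date": "1932-05-14"},
    {"locality": "Berkley, Alameda Co.", "date": "1932-05-11"},
]
print(consensus(labels))
```

Returning the agreement share alongside each value lets low-agreement fields be routed to a person for review instead of being accepted automatically.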

What is being organized? The insect specimens in the CalBug project are digitized on an individual level, with unique identifying numbers, and new specimen records and their associated data are continually being added to the digital collection. Both the specimens and their data are being organized. Existing groups of specimens are prioritized for digitization and new physical specimens are accessioned into the collection and are databased upon arrival.

Why is it being organized? An individual specimen’s associated data can be highly variable; however, as long as a specimen has the time and place of its collection (no matter how vague) associated with it, it is valuable research material. The physical specimens are organized to facilitate the collection manager’s use of the collection. When physical specimens need to be borrowed, they must be efficiently found, packaged, and sent out on loan, so fastidious organization is key when locating thousands of specimens. The digital organization of the collection also facilitates the duties of museum staff and the collection manager by allowing for expanded interaction with the collection by using the database. The digital collection’s web interface, undergoing a redesign as of the time of this writing, makes the collection accessible to researchers and novices alike and fosters data sharing with other data repositories. Since the specimen data follows digital curatorial standards, a web interface that allows these fields to be easily searchable and navigable can add to the use of the collection for a broader audience, which is a major impetus for the redesign.

Figure 12.1. CalBug search interface


CalBug’s redesigned web search interface


How much is it being organized? As discussed in the previous section, the specimens and their information are subject to multiple levels of organization, and each level of organization supports a different type of user. The data of the CalBug Project is organized according to Darwin Core (DwC), a standard “designed to facilitate the exchange of information about the geographic occurrence of species and the existence of specimens in collections.”1 Certain specimen attributes have concrete institutional parameters, such as unique identifying numbers and taxonomic identification, while others have less strict parameters (e.g. a precise location of where a specimen is found), although they still must use specific DwC fields. Although there are institutional taxonomies in place for information associated with a specimen’s collection and identification, the CalBug search interface design in Figure 12.1, “CalBug search interface” allows for an outward-facing reorganization of the existing fields.

When is it being organized, and by whom? The categorization and organization happens at multiple times for one specimen. If identified, the specimen is already inserted into the taxonomic classification scheme—the hierarchy of how species are related. This scientific warrant is inherited and replicated in the physical curation of the collection, and specimens are further sorted (within a taxon) by geographic region. Aligning with taxonomic categories provides a clear hierarchy for sorting and locating physical specimens and, with changes in taxonomy having to be published, makes collection maintenance fairly consistent.

The specimens are organized a second time when they are databased, either by interns or through Notes from Nature. The data is stored in a MySQL database that uses mostly DwC fields, an institutional taxonomy for specimen data. The digitization of specimens, through utilizing DwC institutional semantics, makes collection maintenance, governance, and interaction easier, as the collection manager can search in a multifaceted manner, better understand the holdings of the museum, and track specimens for loans. The unique specimen numbers allow for individual tracking, and the other DwC fields provide multiple areas for accurate search and retrieval.

For the CalBug web search interface, the specimens retain their classification hierarchy within the database. However, the outward-facing search fields aim to serve a broader audience, not just the collection manager and museum staff. Thus the search application organizes the resources a third time “on the way out” of the database in response to a user query. As this design is optimized for researchers and students, the classification appears to focus more on taskonomy instead of the institutional taxonomy (see Figure 12.1, “CalBug search interface”). The 20 search fields provided in the search interface, while actually searching through the ~100 fields in the database, facilitate precise information retrieval. Although fewer search fields might yield lower accuracy, user testing has shown that the new search design improves accuracy by not requiring users to know exactly which DwC field to query.

Figure 12.2. Crosswalk table


This crosswalk table maps the fields in the CalBug search interface to the underlying database columns.


The search is further expanded by having a “Search any field” box, which literally looks in every DwC field for a term, as well as a “Common Name” field, to support novice searches, such as “beetle” and “butterfly” instead of “coleoptera” and “lepidoptera.” The intrinsic properties of the specimens lend the results to simple (alphabetic and numeric) sorting as well as filtering (through the “Refine” option) on the list view of the results pages. Additional views of results, including a map view showing collection locations and a grid view that displays specimen photos, help users locate desired specimens and reorganize as needed to suit their needs.
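The idea of a crosswalk, in which a small set of interface fields fans out to many database columns, can be sketched as a lookup table plus a matching function. The column names below are Darwin Core terms, but which terms CalBug maps to each search field is an assumption made for illustration, as are the sample records. The real system queries a MySQL database rather than an in-memory list.

```python
# Hypothetical crosswalk: interface field -> underlying database columns.
CROSSWALK = {
    "Taxon": ["order", "family", "genus", "scientificName"],
    "Location": ["country", "stateProvince", "county", "locality"],
    "Common Name": ["vernacularName"],
}

def matches(record, search_field, term):
    """True if `term` occurs in any column the interface field maps to."""
    term = term.casefold()
    if search_field == "Search any field":
        columns = record.keys()          # look in every column of the record
    else:
        columns = CROSSWALK[search_field]
    return any(term in str(record.get(col, "")).casefold() for col in columns)

def search(records, search_field, term):
    """Return the records that match `term` under the given interface field."""
    return [r for r in records if matches(r, search_field, term)]

# Made-up specimen records for demonstration.
specimens = [
    {"catalogNumber": "X-0001", "order": "Coleoptera",
     "vernacularName": "beetle", "county": "Alameda"},
    {"catalogNumber": "X-0002", "order": "Lepidoptera",
     "vernacularName": "butterfly", "county": "Marin"},
]

print(search(specimens, "Common Name", "beetle"))
print(search(specimens, "Search any field", "marin"))
```

A user who types "beetle" never needs to know that the value lives in a column called `vernacularName`, which is the sense in which the interface reorganizes the fields "on the way out."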

Notes: 1. http://wiki.tdwg.org/twiki/bin/view/DarwinCore/WebHome

Weekly Newspaper

By Ian MacFarland, December 2013.

Overview. A weekly neighborhood newspaper in New York City now covers the entire borough of Queens. Rather than publish a single weekly edition for this highly diverse area of more than 2 million people, its owners have opted to produce 14 separate editions, each centered on a different neighborhood. All editions share a deadline, delivery schedule, and staff pool, but each has unique content tailored to its target readers.

What is being organized? The newspaper’s resources—its content—consist mainly of articles and photos generated by staff and freelance contributors throughout the week. Often, newspapers will assign their reporters to beats based on subject matter (politics, education, “cops and courts,” etc.), making them domain experts who cover stories on that beat throughout a wide geographical area. However, because of this paper’s historical orientation toward “hyper-local” neighborhood news, it has given each of its seven full-time reporters a more granular geographical beat that corresponds to two of the 14 editions’ coverage areas, within which they are responsible for general assignment reporting. Most reporters also have a specialty for covering news that is of more general interest throughout the borough, such as citywide government or transportation issues, and they will include coverage of these domains in their story budgets for the week as well. The staff maintains a centralized story list that includes a handful of resource descriptions for each story: its slug (an abbreviated, descriptive name, including tags for its relevant neighborhoods), its length, and whether it has “art.”

Why is it being organized? The media market in New York is crowded and extremely competitive, and this newspaper believes its competitive edge lies in its laser-focus on individual neighborhoods. Furthermore, most of its readers are subscribers who receive the paper in the mail, not newsstand buyers. As a result, the paper generally eschews the familiar tabloid approach of splashing the most salacious story of the week across the front page and usually fronts two stories that are “small-bore” but extremely relevant to the neighborhood, such as the doings of local school or government officials, notable crimes, or human-interest stories featuring neighborhood residents. The deeper into the paper one goes, the less local its content becomes, and stories often appear in more than one edition, in different locations and even with different headlines, to tailor them to an appropriate level of localization.

On a more general level, of course, the paper must support the conventional interactions all readers expect from newspapers. Readers are rarely expected to progress through the paper from front to back, so it supports a wide variety of reading styles; large headlines and photos and concise, compelling story “ledes” (opening paragraphs) facilitate skimming and scanning interactions, and dividing the paper into sections, such as “Opinion,” “Sports,” and “Arts & Entertainment,” lets readers skip directly to their areas of interest after turning past page one.

How much is it being organized? The level of organization behind the scenes at this small, local newspaper is surprisingly complex. The primary organizing principle that determines a story’s placement is its relevance, which is a function of location granularity (does it directly affect the people of this neighborhood? Did it happen here?), significance (will readers find it important?), and time (is it old news? Has anyone else reported it yet?). Counterbalancing that is the economic reality of the struggling newspaper industry, which results in often severely limited space for the news (because paper and press time are costly physical constraints) and manpower with which to produce all 14 editions before deadline. The result is a hierarchical system in which the 14 editions are categorized into three zones; in each zone, about two-thirds of the pages are common to all editions, and the remaining third (including, most crucially, pages one through three) are unique to each single edition. Thus, for instance, a general-interest story about transportation need not be laid out 14 separate times, but one about a fatal car accident can appear on page one for the neighborhoods where it occurred and where the victims were from, and further back (or not at all) for other neighborhoods.
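The saving from the zoned scheme is easy to quantify. The sketch below counts how many distinct pages must be laid out each week with and without zoning. The 24-page edition is a hypothetical figure, since the text does not give a page count, and "about two-thirds" is treated as exactly two-thirds.

```python
EDITIONS = 14
ZONES = 3

def pages_to_lay_out(pages_per_edition):
    """Count distinct page layouts needed per week under the zoned scheme."""
    common = pages_per_edition * 2 // 3      # pages shared within a zone
    unique = pages_per_edition - common      # pages specific to one edition
    # Common pages are laid out once per zone; unique pages once per edition.
    return ZONES * common + EDITIONS * unique

pages = 24  # hypothetical page count; the text does not give one
zoned = pages_to_lay_out(pages)
unzoned = EDITIONS * pages
print(zoned, unzoned)  # 160 336
```

Under these assumptions the layout staff produce fewer than half as many pages as they would if every edition were built from scratch, while the pages readers see first remain unique to their neighborhood.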

When is it being organized? In a weekly news cycle, selection, creation, and organizing of editorial resources is largely concurrent. The story list is updated on a rolling basis throughout the week, and an article or photo’s placement in the paper is often planned based on its intended subject matter well in advance of when the resource is actually created. However, organizing must be completed long before the paper reaches its intended users, because the final layouts must be printed, collated, and mailed to readers, which, due to logistical concerns, takes several days. The paper is therefore laid out on Tuesday (as late as possible to maximize the window for ad sales), printed on Wednesday, and delivered by the Postal Service on Thursday or Friday.

How or by whom is it being organized? Human agents—specifically, editors—are the newspaper’s primary organizers. They rely heavily on the judgment of the reporters, who are most familiar with their beats, to determine a story’s relevance and placement for each edition, as well as their own news judgment, assessment of the story’s quality, and estimation of where the story will physically fit based on ad placements (which are decided first). The implementation of their organizing system is carried out by page layout designers, with some software automation on the part of the paper’s content management system.

Other considerations. Part of the grind of a weekly news cycle is that the effectivity of the paper’s resources is never guaranteed; when the next edition comes out, they all become yesterday’s news, and one never knows when new developments will render a story irrelevant or incorrect; in fact, because of the latency between layout and delivery, a story’s effectivity may even expire before its publication.

The CODIS DNA Database

By Becca Stanger, December 2013.

Overview. Operating on a local, state, and federal level, the Combined DNA Index System (CODIS) is the FBI DNA database. As of October 2013, the National DNA Index System (NDIS), or the federal level of the CODIS, contained over 10,647,800 offender profiles, 1,677,100 arrestee profiles, and 522,200 forensic profiles. Designed to help solve crimes, this database has generated over 255,400 hits and has aided over 216,200 investigations. While this organizing system has played a crucial role in reducing crime by enabling more interactions for law enforcement agencies than ever before, it provokes numerous ethical questions worth exploring.

What is being organized? The CODIS database maintains digital records, or “DNA profiles,” for a wide range of people involved in criminal justice cases, including convicted offenders, arrestees, missing persons, and more. Specifically, these profiles are measurements of one or two alleles of 13 predetermined unique genetic sequence loci. These precise measurements provide enough granularity for the profiles to uniquely identify a single individual.

These resource descriptions are generated, often with polymerase chain reaction technology, from the original DNA specimen resources by accredited laboratories nationwide. Upon creation, the resources themselves—the specimens—are kept at the laboratories, while the resource descriptions—the digital profiles—are added to the CODIS database. No offender personal identifiers are assigned to the profiles; however, information on the submitting agency, specimen, and personnel is stored with the profile.

Rather than focusing on collecting resource descriptions, the FBI could have chosen to collect the original resources themselves. Presumably, though, this level of coordination of physical resources (e.g., shipping, storage, maintenance, etc.) would have placed an additional cost on the federal government and required legislative approval. Thus, it is understandable that the FBI would choose to minimize cost and effort by focusing on the resource descriptions alone.

Why is it being organized? In the past, law enforcement agencies were limited to solving crimes within their geographic region. A detective working on a murder in California, for example, may never have heard of a related murder in New York. The CODIS database organizing system encourages that coordination between law enforcement agencies in an effort to solve crimes.

With 10,647,800 offender profiles in the NDIS alone, though, the massive CODIS database required an organizing system in order to prove useful to the law enforcement agencies involved. The successful creation and maintenance of this organizing system has offered newfound interactions to a wide variety of government officials. In addition to law enforcement agencies, judicial courts, criminal defense agencies, and population statistics agencies can access the CODIS organizing system, enabling them to perform a wide variety of functions, including identifying potential suspects in criminal investigations, identifying missing persons, collecting population statistics, and exonerating convicted criminals.

How much is it being organized? As mentioned previously, the high degree of resource description granularity in measuring 13 specific genetic sequence loci enables DNA profiles to uniquely identify each individual in the database. That being said, the DNA profiles are not simply heaped into one massive database.

Instead, the databases are maintained on both a state and federal level. A new profile might be checked against a smaller state database as well as the larger national one. In addition, the databases are divided into different indices dependent on the DNA source, including an offender index, arrestee index, forensic index, and missing persons index.
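The subdivision described above can be illustrated with a toy index keyed by jurisdiction level and DNA source. The key names, the string stand-ins for profiles, and the `add_profile` and `search` helpers are hypothetical; the actual CODIS search rules are not described here.

```python
# (level, source) -> list of profile keys; strings stand in for full profiles.
indices = {}


def add_profile(level, source, profile_key):
    """File a profile under one index, e.g. ("state", "offender")."""
    indices.setdefault((level, source), []).append(profile_key)


def search(profile_key, levels, sources):
    """Return every (level, source) index that contains the given profile key.

    Only the requested indices are scanned, so a narrower search
    touches less data than a search across everything.
    """
    return [
        (level, source)
        for level in levels
        for source in sources
        if profile_key in indices.get((level, source), [])
    ]


add_profile("state", "offender", "P1")
add_profile("national", "forensic", "P2")
add_profile("national", "missing persons", "P3")

# Check a new profile against the state and national forensic and offender indices.
print(search("P2", ["state", "national"], ["offender", "forensic"]))
```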

This division of the database into separate indices poses a tradeoff dilemma, though. If CODIS did not subdivide the database into federal, state and source indices, it is possible the algorithm would be able to find more obscure hits, since the search parameters would be broadened. This increase in hit frequency might result in more investigations aided.

The tradeoff, however, is that the broadened search parameters would also necessitate a more complex search algorithm and a longer search process. This delay would most likely lead to fewer hits overall. Thus, in government institutions where time and resources are limited, it is more important for the CODIS organizing system users to generate a larger number of hits with subdivided databases than more accurate hits in one collective database. Categories in the CODIS organizing system help simplify the interaction processes.

When is it being organized? DNA profiles enter the CODIS organizing system when participating, accredited local, state, and federal laboratories submit them. Thus, the laboratory technicians handling the resource and resource description decide on a case-by-case basis how a given profile should be categorized and which indices it should be added to and checked against.

That being said, the lab technicians are given strict standards on how a given DNA profile should be categorized. These standards vary state by state depending on state law.

How or by whom is it being organized? Beyond laboratory and state involvement in CODIS, the FBI ultimately maintains and oversees the CODIS database. It maintains the software and search algorithms, performs searches throughout the system, and oversees strict quality assurance standards for all participating laboratories.

To avoid the risk of bias or error amongst lab technicians, the FBI could potentially choose to perform the laboratory processing and categorization itself. This alteration, however, would present new challenges, such as new federal costs related to maintaining and processing the resources mentioned previously. In addition, pulling together all resources into an FBI processing center would necessitate a meticulous record of each resource’s originating state to ensure resource descriptions are categorized in accordance with state laws. The FBI’s strict maintenance of standards and laws is the best option for addressing the risk of error and bias.

Other considerations. The CODIS organizing system presents a wide range of intriguing ethical questions surrounding race, gender, criminal justice, and privacy. Perhaps the most hotly debated issue surrounding DNA databases arose when the private DNA testing company 23andMe announced that it would discontinue the sale of its genetic tests in response to FDA demands, prompting more media questions than ever before on the maintenance and use of DNA databases.

Likewise, many have questioned the legitimacy of the CODIS maintenance of DNA profiles. The ACLU, for example, has noted the possibility of “function creep” in the maintenance of a government DNA database which could lead our country down a slippery slope towards a “brave new world” where private genetic information could be collected and used in abusive, discriminating manners.

With the commercial surveillance of 23andMe and government surveillance by the NSA at the forefront of media attention, it is possible we will see more attention turned to the legitimacy of the maintenance of the CODIS organizing system in the coming years.

Honolulu Rail Transit

By Carlo Liquido, December 2015

Overview. The Honolulu Rail Transit Project is an urban rail rapid transit system under construction in Honolulu on the island of O’ahu, Hawaii. Honolulu’s notoriously bad traffic has plagued locals and tourists for decades, and for almost as long, proposals to address the traffic problems and pay for the solution have been very contentious and political. Construction began in 2011 and is expected to finish in 2019, but delays have been frequent.

What resources are being used? The new railway transit system under construction in O’ahu will run along the southwest region of the island spanning a total of 20 miles, from East Kapolei to Downtown Honolulu with a total of 21 stops strategically placed throughout. There are a number of ways in which one could scope this project. What are the cultural and political limitations? What are the environmental effects and resources that will be indirectly affected? What are the topographic constraints of a railway system in Hawaii? In terms of the scope of my analysis, however, the people—namely the things the organizing system is intended for—are the primary resources. The principle guiding the organizing system is to reduce traffic and make the traveling experience more efficient as a whole.

O’ahu Traffic


O’ahu traffic is usually congested, especially in and around Honolulu on the south and southeast sides of the island

(Hawaii Dep. of Economic Development and Tourism. CC-BY-2.0.)

Why are the resources organized? The guiding principle behind the organizing system of a rail transit system is to reduce traffic and make commuting more efficient. According to the Department of Business, Economic Development and Tourism, the amount of traffic on almost every major highway on O’ahu increased from 2012 to 2014. Moreover, the dearth of job creation on other parts of the island, namely the west side, has focused traffic into and out of downtown Honolulu, as shown in the first map.

This skewed traffic pattern, limited real estate, and inflexible road infrastructure have necessitated an above-ground railway system linking the west side of O’ahu with the burgeoning downtown area of Honolulu. This new organizing system seeks to rebalance the traffic system by reorganizing its resources, that is, by taking drivers and bus commuters off the road and onto the rail. O’ahu has only three major freeways, the H1, H2, and H3. The H2 freeway bottlenecks from the west into the H1. Drivers and bus commuters are organized in such a way that peak hours of traffic are unavoidable. The new transit will conceivably provide an additional layer of organization to the currently

How much are the resources organized? There are 21 planned stations that run along the 20-mile span of track. The train stations are arranged to serve as many people as possible by concentrating them in the most densely populated areas.

Population Density


Population in Honolulu area is highly concentrated in lowland areas.

(Hawaii Dep. of Economic Development and Tourism. CC-BY-2.0.)

Darker areas represent high-density tracts while lighter areas represent low-density tracts. The densely populated stretch from Keahi Lagoon to Honolulu Downtown also has the highest density of traffic. It makes sense that this portion of the rail system accounts for almost half of the total stops in just a quarter of the total mileage.

Honolulu Area Income per Household


Household income is lowest in the most densely populated areas.

(Hawaii Dep. of Economic Development and Tourism. CC-BY-2.0.)

Income per household also plays a vital role in how these stops were selected. The rail transit system runs predominantly through low-income neighborhoods (tan and brown indicate low income per household, while green indicates high income per household). This design principle embodies an assumption that people with lower incomes are more likely to rely on public transit.

When are the resources organized? As with any construction project of this magnitude, the organizing system was planned in detail before construction—down to the number of pillars, the amount of concrete, the imported steel for rail cars, etc. However, after construction excavation revealed ancient burial sites, the Native Hawaiian community demanded many changes to the project. The number of stops has remained the same but the route has changed dramatically.

How or by whom is it being organized? There are a number of interested parties with varying degrees of power. At the forefront, the government—that is, the State of Hawaii—makes the final decision. However, the people of Hawaii directly influence their decisions.

The protection of cultural resources, practices, and beliefs is important in Hawaii, both as a matter of law and of culture. Private archeology firms, state officials, and cultural descendants work together to reduce and mitigate impacts to archaeologically significant properties. The Oahu Island Burial Council, for instance, is a state council created to help protect iwi kupuna (ancestral bones). It stresses the importance of consulting recognized lineal descendants before any excavation for the rail project is carried out.

Where is it being organized? The “where” component of the organizing system is not as important for the scope of this analysis as other design questions. However, the physical nature of the project highly constrains how the system can be organized. The volcanic origin of O’ahu does not allow for a below-ground rail system. The limited real estate, similarly, does not allow for a ground-level system. The sharp and steep volcanic ridges that cut across the island are barriers that limit where the rail system might go.

The Antikythera Mechanism

By Murray Maloney, 2 March 2014.

Overview. In 1900, a strange-looking mechanical device was recovered from a shipwreck off the island of Antikythera, Greece. Only in the 1970s was it determined that the device was an ancient mechanical computer that performed astronomical calculations; it had a manual crank control with a rate of one turn per day, forward or backward in time; its user interface presented calendrical, solar, lunar, and planetary positions.1

The Antikythera Mechanism


The Antikythera Mechanism exhibit at the National Archaeological Museum of Athens.

(Photo by Tilemahos Efthimiadis. CC-BY-2.0 license.)

The Antikythera Mechanism persists through time as a collection of artifacts and a model of intellectual achievement. Thought to have been constructed by Archimedes at Syracuse or by Posidonius at Rhodes, the mechanism was recovered from a shipwreck near the Greek island of Antikythera in 1900-1901. The significance of the find only began to become apparent in the 1970s when researchers applied modern scanning technology.2

What is being organized? The Antikythera Mechanism was an arrangement of resource descriptions that represented a classical Alexandrian solar-lunar calendar, complete with an almanac of the positions of the sun, moon, known planets, and specific stars over time. The resource descriptions are represented simply by the measurements of the gears and by the corresponding information displayed on the front and rear panels, which depends on the position of those gears. These resource descriptions accounted for the range of known astronomical phenomena at the time.3

The organization of the mechanism consists of a main solar gear connected to a hand crank and a collection of gear trains that ultimately control the rotation of pointers indicating the calendar, lunar position and phases, the position of the sun and of all the known planets, and the nearest eclipse. The mechanism was housed in a wood frame box with bronze panels whose physicality was obviously intrinsic to the use of the device; the back door was inscribed with what seems to be a user’s guide.

The Antikythera Mechanism calculated the position of the moon by employing five gear trains to take into account the Saros, Metonic, Callippic, and Exeligmos cycles. Thus, it was able to predict the dates of solar and lunar eclipses.
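The arithmetic behind these cycles can be checked in a few lines. The figures for the cycle of 76 years, 27,759 days, and 940 lunations are the ones quoted in the notes to this case study; the Metonic, Saros, and Exeligmos figures are the commonly cited textbook values and are an assumption here, since the text does not state them. The sketch says nothing about how the gear trains themselves were cut.

```python
from fractions import Fraction

# Commonly cited values (assumption: not stated in the case study itself).
METONIC_YEARS, METONIC_LUNATIONS = 19, 235
SAROS_LUNATIONS = 223
EXELIGMOS_LUNATIONS = 3 * SAROS_LUNATIONS        # three Saros cycles

# Values quoted in the notes to this case study.
CALLIPPIC_YEARS = 76
CALLIPPIC_DAYS = 27_759
CALLIPPIC_LUNATIONS = 940

# The 76-year cycle is four Metonic cycles, in years and in lunations.
assert CALLIPPIC_YEARS == 4 * METONIC_YEARS
assert CALLIPPIC_LUNATIONS == 4 * METONIC_LUNATIONS

# Exact ratios implied by the cycle.
mean_year = Fraction(CALLIPPIC_DAYS, CALLIPPIC_YEARS)          # days per year
mean_lunation = Fraction(CALLIPPIC_DAYS, CALLIPPIC_LUNATIONS)  # days per lunation

print(f"mean year     = {float(mean_year):.2f} days")
print(f"mean lunation = {float(mean_lunation):.4f} days")
print(f"exeligmos     = {EXELIGMOS_LUNATIONS} lunations")
```

The implied year is exactly 365 1/4 days and the implied lunation is about 29.5309 days, which shows how a single whole-number cycle fixes both ratios at once.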

Today, the Antikythera Mechanism is a collection of the eighty-two fragments that have been recovered from the shipwreck and sea bed, twenty-three of which are evidently inscribed. The fragments have been dated to about 70 BCE based on the coincident presence of some coins from Pergamum and Ephesus that were recovered in the 1970s.

Why is it being organized? From a purely pragmatic perspective, the Antikythera Mechanism was a relatively portable computational device that would have been used to accurately reckon a very specific calendar system, and to predict the cycles of days, months, years, and saros, as well as lunations, eclipses, and Olympic games. It would have been an invaluable tool for astronomers, mathematicians, civil engineers, and cartographers of the time.

From a philosophical perspective, the Antikythera Mechanism was built to prove that it could be done. It represents a fulfillment of Aristotelian thought. Through the ages, the lure of scientific answers to the mathematical riddles presented among the patterns in the heavens has challenged our burgeoning intellects. The Antikythera Mechanism realized then-modern thinking on mathematics, engineering, astronomy and calendrical calculation in a portable mechanical computational device.4

How much is it being organized? Some of the major fragments are on display at the National Archaeological Museum of Athens; the others are stored.

The Gears


Gear arrangement.

(Wiki Commons.)

The Antikythera Mechanism is reported to have had about thirty gears within a frame smaller than a large book. This level of miniaturization and precision of fabrication was not seen again until the next millennium. The engineering and machining would have required trial models, accurate plans, and custom tooling. There have been various modern attempts to re-create the Antikythera Mechanism, or at least to re-create the model it seems to have manifested.5

When is it being organized? The person who operates the mechanism turns a hand-operated crank to establish a date, or conversely confirms the current date by taking sightings and comparing them with the dial settings. The front face offers a solar-lunar calendar dial, a tropical zodiac dial, and an almanac dial with rising and setting times of various stars. The rear panel offers dials representing the five lunar cycles.

The Antikythera Mechanism


A recreation of the Antikythera Mechanism on display at the National Archaeological Museum of Athens.

(Photo by Tilemahos Efthimiadis. CC-BY-2.0 license.)

The organization of the engineering data required to build, operate, and maintain the Antikythera Mechanism is staggering to imagine, yet it pales in comparison to the organization required to collect and archive astronomical sightings on clay tablets for hundreds of years.6 (See the sidebar, A Cuneiform Document at the Pergamon.)

The organization of the fragments of the Antikythera Mechanism is in the hands of the Bronze Collection of the National Archaeological Museum in Athens.

How or by whom is it being organized? Ancient Chaldean, Greek, and Roman astronomers and engineers; modern divers, marine archaeologists, curators and researchers. In 1978, Jacques Cousteau led an expedition to the sea bed and returned some historical artifacts that, while unrelated to the Antikythera Mechanism itself, provide additional historical context and may help date the discovery.

The Antikythera Mechanism Research Project is a collaboration of academic, industrial, and scientific researchers, who are applying some of the world’s most advanced technology to study the capabilities and applications of the Antikythera Mechanism, as well as its historical context and significance.

Other considerations. From the perspective of one ship’s unlucky captain and crew, the Antikythera Mechanism was likely just a piece of cargo, although it may have accompanied an equally unlucky passenger carrying the world’s first computer to Caesar’s court in Rome. It remains unknown how or why the device was aboard the ship or what fate befell it, but that is a story for researchers and historians to uncover in the fullness of time.

Notes: The following notes relate to this case study.

PBS aired Ancient Computer on April 3, 2013. The BBC aired Ancient Moon ‘computer’ revisited

The Antikythera Mechanism Research Project recently published The Inscriptions of the Antikythera Mechanism. Y. Bitsakis, M. G. Edmunds, A. Jones, et al. Almagest 7-1, May 2016.

Cicero wrote about a similar device, created by Archimedes, in M. Tvlli Ciceronis de Republica Liber Primvs

Gears from the Greeks. The Antikythera Mechanism: A Calendar Computer from ca. 80 B.C. Derek de Solla Price. Transactions of the American Philosophical Society, New Series, Vol. 64, No. 7 (1974), pp. 1-70.

Aristotle’s work on the subject, On the Heavens (c. 350 BCE), refers to the mathematical symmetry and perfection in the travels of the spheres, envisioning cycles and epicycles in motion.

In 343 BCE, Aristotle was head of the Macedonian Academy, where he tutored Alexander and his future general, Ptolemy Soter. Following Alexander’s conquest of Babylon in 331 BCE, he ordered Kallisthenes to organize the translation of all historical astronomical observations, initiating the transfer of the world’s greatest collection of astronomical observations, dating back to 747 BCE. Within a year, Callippus had developed a new calendar, designating the summer solstice of 330 BCE as an epoch for astronomers and calendrical calculation. The Callippic cycle of 76 years less a day, which equates to 27,759 days and 940 lunations, is represented in the gearing of the mechanism.

Ptolemy established his capital at Alexandria and founded a museum, spawning the need for a library, in the Platonic style. His successors, through to Cleopatra, added to the papyrus rolls. Mathematicians, astronomers, mechanical engineers, scientists; the most famous thinkers of the ancient world studied in the halls of the Library at Alexandria. Notable to us in this context are Euclid, Archimedes, Eratosthenes, Hipparchus, Aristarchus, and Posidonius.

According to Pliny, the calendar reform of Julius Caesar was assisted by Cleopatra’s astronomer, Sosigenes of Alexandria, who “brought the separate years back into conformity with the course of the sun.”

In 2010, Andrew Carol built a Lego model of the Antikythera Mechanism on a dare. John Pavlus wrote and directed a short film, Behind the Scenes: Lego Antikythera Mechanism.

Hublot, the Swiss maker of luxury timepieces, created a special edition Antikythera Watch. Hublot is also sponsoring ongoing research. See Return to Antikythera: A project of the Hellenic Ministry of Culture and Sports with support from the Woods Hole Oceanographic Institution.

A simulation of the Antikythera Mechanism is available as an open source application on GitHub.

The Antikythera Mechanism Research Project maintains a list of Solid Models of the Antikythera Mechanism.

In his Almagest, Claudius Ptolemy marks the beginning of an epoch in recorded time, 1 Thoth 1 Nabonassar, with the coincident occurrence of a solar eclipse and the accession of the Chaldean king Nabonassar in 747 BCE. (See the Almagest Ephemeris.) Nabonassar’s calendar reform began a period of seven hundred years of meticulous record keeping, indexing, summarizing, and studying. The scientific study of astronomy based upon recorded observation is thought to have begun with Nabonassar. When we talk about the discipline of organizing, we can tip our hats to Nabonassar.

John M. Steele (2000). Observations and Predictions of Eclipse Times by Early Astronomers. Kluwer Academic Publications. pp. 43–45.

The British Museum stores the “Babylonian astronomical diaries,” a highly systematized collection of ancient cuneiform texts that record periodic astronomical events, commodity prices and weather conditions over a period extending from 652 BCE to the 1st century BCE.

Aaboe, Asger. The culture of Babylonia: Babylonian mathematics, astrology, and astronomy. The Assyrian and Babylonian Empires and other States of the Near East, from the Eighth to the Sixth Centuries B.C.E Eds. John Boardman, I. E. S. Edwards, N. G. L. Hammond, E. Sollberger and C. B. F. Walker. Cambridge University Press, (1991)

Related Readings. See the section called “Resources over Time”

Autonomous Cars

By Jason Danker, December 2015.

Overview. Automation in cars is nothing new. Automatic transmissions and cruise control have been around since 1939 and 1958 respectively, but these systems serve to aid, rather than replace, human drivers. What is new is the near-future potential for fully autonomous cars: cars that are capable of full operation without an attending human driver.

While other vehicles, such as light rail and monorail trains, have been capable of fully automatic operation since 1967, these vehicles have the luxury of operating in closed environments and only need to be able to respond to a defined set of inputs. Autonomous cars do not have this luxury. In operating “in the wild,” the systems guiding these cars may be forced to respond to any number of unanticipated situations. As the automation system cannot enumerate all possible situations, it must instead rely on continuous organization of its operating environment.

This is clearly a technical challenge, but it also raises ethical and legal issues. As autonomous cars act based on the organization of sensory inputs, the organizing systems are necessarily developed relative to ethical considerations, whether intentional or not. At the most basic level, the organizing system will direct the autonomous car in making decisions analogous to those posited in the trolley problem, a famous thought experiment in ethics that forces a choice between saving five endangered people or taking the life of an innocent person who had not been in danger. Beyond ethics, autonomous cars also raise legal questions: if an autonomous car crashes, who is liable for the damages?

What is being organized? An autonomous car will organize information about the car itself, the objects in its vicinity, and environmental conditions. The car must keep track of its movements, those of other objects, and the relative positions of itself and the other objects. It must organize this information within the environmental framework of lane markings, speed limits, road signs, traffic signals, weather and traffic conditions, and numerous other constraints. As autonomous cars become common, the cars will likely communicate with one another and this information will also need to be brought into the organizing system. The car will also need to organize, and likely prioritize, inputs from human occupants. Regardless of the exact implementation, the organizing system will necessarily limit what is worthy of organization: it is likely not possible, or desirable, to keep track of every insect in the vicinity of the car.

Why is it being organized? The car organizes its surroundings in order to safely navigate to a destination. While this is the primary interaction enabled by the organization, countless other interactions support this primary interaction. The supporting interactions fall into the two categories of prediction and reaction. The systems being developed by Google use the information that has been organized to predict what is most likely to happen next: “It predicts that the cyclist will ride by and the pedestrian will cross the street.” The systems that have been launched by Tesla tend to be more reactionary: “Side Collision Warning further enhances Model S’s active safety capabilities by sensing range and alerting drivers to objects, such as cars, that are too close to the side of Model S.”

How much is it being organized? The extent of organization varies based on the implementation. While Google uses on-board sensors and extremely detailed street maps to implement self-driving functionality, Tesla’s Autopilot relies on on-board sensors and standard GPS data. While the exact extent of the organization is not publicly available information, Google has publicly stated “the system is engineered to work hardest to avoid vulnerable road users (think pedestrians and cyclists), then other vehicles on the road, and lastly avoid things that don’t move.” Given this, Google’s categories, and their hierarchy, appear to be defined by their vulnerability.
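The vulnerability hierarchy in the quoted statement can be sketched as an ordered set of categories. This is not Google’s implementation, which is not public: the `Category` names, the `DetectedObject` record, and the use of distance as a tie-breaker within a category are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import IntEnum


class Category(IntEnum):
    """Lower value = higher avoidance priority (ordering taken from the quoted statement)."""
    VULNERABLE_ROAD_USER = 0   # e.g. pedestrians and cyclists
    VEHICLE = 1                # other vehicles on the road
    STATIC_OBJECT = 2          # things that do not move


@dataclass
class DetectedObject:
    label: str
    category: Category
    distance_m: float          # hypothetical sensor estimate, in meters


def avoidance_order(objects):
    """Sort by category first; within a category, nearer objects come first (assumption)."""
    return sorted(objects, key=lambda obj: (obj.category, obj.distance_m))


scene = [
    DetectedObject("parked trailer", Category.STATIC_OBJECT, 4.0),
    DetectedObject("delivery van", Category.VEHICLE, 12.0),
    DetectedObject("cyclist", Category.VULNERABLE_ROAD_USER, 20.0),
    DetectedObject("pedestrian", Category.VULNERABLE_ROAD_USER, 9.0),
]

for obj in avoidance_order(scene):
    print(obj.category.name, obj.label)
```

In this sketch the cyclist outranks the parked trailer even though the trailer is much closer, because category takes precedence over distance.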

When is it being organized? For information gathered by on-board sensors, organization takes place as objects enter and leave the vicinity of the autonomous car. The organization is ongoing as the car’s surroundings and environment are constantly changing. In addition to the sensor data, autonomous cars also rely on map data, which is organized in advance. Google’s cars rely on specialized, highly detailed maps that are being developed as part of the self-driving car project and, as such, are unable to drive on roads that have not yet been mapped to the necessary level of detail. While Tesla’s Autopilot also relies on maps, it uses standard GPS maps and is not similarly restricted.

How or by whom is it being organized? The car’s computational processes are responsible for the organization. That said, the car is restricted to organizing within the organizing system implemented by the manufacturer. While Google and Tesla are two of the main companies in this space, many traditional automotive companies are also developing autonomous systems.

Where is it being organized? Except for map data, the organization takes place within the car’s onboard systems. The organization must take place in the car itself due to the potentially catastrophic consequences of a lag in information flow. Additionally, ensuring all organization takes place within the car provides greater security: a self-contained car is less susceptible to attack than a network-dependent one.

Other considerations. While it is likely that fully autonomous cars will be technologically feasible within a few years, the cars may still require human interactions for legal reasons. This is clearly seen in Tesla’s press release for Autopilot: “The driver is still responsible for, and ultimately in control of, the car.” This human-in-the-loop design principle creates a legal buffer for autonomous car manufacturers by treating the “driver” as a “liability sponge” or “moral crumple zone.” As articulated by Madeleine Elish and Tim Hwang, “the human in an autonomous system may become simply a component—accidentally or intentionally—that is intended to bear the brunt of the moral and legal penalties when the overall system fails.”

While these issues will ultimately play out through a combination of court rulings and policy decisions, it is interesting to note that there is legal precedent that could either blame, or exonerate, the “driver” of an autonomous car. Drawing parallels to aviation automation, precedent suggests that the human “driver” will be held responsible for liability claims arising from the operation of the car. On the other hand, product liability law offers recourse for consumers when a company’s products fail. Many people have argued that this existing legal framework is sufficient to handle the liability issues brought up by autonomous vehicles.

Regardless of the legal complexities that will arise from specific incidents, autonomous cars have great potential to reduce car crashes and improve overall road safety. The promise of the autonomous technology, even for partially autonomous systems, is so great that the National Highway Traffic Safety Administration is proposing updates to its safety ratings that will penalize manufacturers that don’t include autonomous technologies in their vehicles.

IP Addressing in the Global Internet

By Andrew McConachie, December 2013.

Overview. Most people take for granted that the Internet just works. They connect their computer to the Internet, it gets an IP address, and they are able to communicate with a computer with a different IP address on the other side of the planet. How did their computer get the correct IP address? How does any computer or router get the correct IP address? How did the routers and other computers on the Internet get their IP addresses? Who decides which computers and which routers get which IP addresses?

What is being organized? At its simplest, an IPv4 address is a 32-bit series of 0s and 1s. IPv4 addresses are born-digital resources, as they have no canonical physical representation. Their canonical digital representation, with which we have all become familiar, is called the “dotted quad” format and consists of four numbers between 0 and 255 separated by dots. For example, is the IPv4 address for www.berkeley.edu.
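The relationship between the 32-bit value and its dotted quad form can be sketched as follows. The address `192.0.2.1` is an assumption chosen for illustration (it comes from a range reserved for documentation) and is not the address of any site mentioned in the text; the helper names are likewise hypothetical. The result is cross-checked against Python’s standard `ipaddress` module.

```python
import ipaddress


def dotted_quad_to_int(text):
    """Pack four decimal numbers (0-255 each) into one 32-bit integer."""
    octets = [int(part) for part in text.split(".")]
    if len(octets) != 4 or any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError(f"not a dotted quad: {text!r}")
    value = 0
    for octet in octets:
        value = (value << 8) | octet   # shift left one byte, append the next octet
    return value


def int_to_dotted_quad(value):
    """Unpack a 32-bit integer back into dotted quad notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


example = "192.0.2.1"                  # documentation address, illustration only
as_int = dotted_quad_to_int(example)
as_bits = format(as_int, "032b")       # the underlying series of 32 zeros and ones

print(as_int)
print(as_bits)
print(int_to_dotted_quad(as_int))

# The standard library agrees with the hand-rolled conversion.
assert as_int == int(ipaddress.IPv4Address(example))
```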

Not all IP addresses are of equivalent classes. There are unicast, multicast, broadcast, and experimental IPv4 addresses, and unicast addresses can be either public or private. There are also two different versions of IP addresses currently in use on the Internet, IPv4 and IPv6. We will focus on IPv4 unicast public IP addresses, since these are not only the most common, but also the most important. This is roughly the range of IP addresses from to, with some breaks in the middle for private IP address space.
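The classes listed above can be told apart programmatically. The sketch below leans on the property flags of Python’s standard `ipaddress` module; the `describe` helper, its labels, and the sample addresses are assumptions for illustration, and the order of the checks is a design choice rather than an official taxonomy.

```python
import ipaddress

LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def describe(text):
    """Give a rough, illustrative label for an IPv4 address."""
    addr = ipaddress.IPv4Address(text)
    if addr.is_multicast:
        return "multicast"
    if addr == LIMITED_BROADCAST:
        return "broadcast"
    if addr.is_reserved:               # the block set aside for experimental use
        return "experimental"
    if addr.is_private:
        return "private unicast"
    if addr.is_global:
        return "public unicast"
    return "other special-purpose"


for sample in ("8.8.8.8", "10.1.2.3", "224.0.0.1", "240.0.0.1", "255.255.255.255"):
    print(sample, "->", describe(sample))
```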

Why is it being organized? IP addresses are the foundation of network connectivity and the Internet; they identify each device on a computer network and also serve as its address, so that routers and other devices can locate and communicate with it. You cannot get online without one. IP addresses can be grouped into blocks, or subnetworks, using a prefix and a mask. For example, represents all IP addresses in the range of – Internet routers do not have enough memory to hold routes for every individual IP address on the Internet. So by organizing the Internet into subnetworks based loosely on a hierarchical model, routers are able to determine the correct path for every destination in the network without actually storing every address in their memory. If the organization of IP addresses were not handled properly, Internet routers would exhaust their memory space and parts of the Internet would become unreachable.
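A minimal sketch of both ideas, a prefix-and-mask block and a router choosing a path without storing individual addresses, might look like this. The prefixes come from ranges reserved for documentation, and the route table, link names, and `next_hop` helper are hypothetical; real routers use specialized data structures rather than a linear scan.

```python
import ipaddress

# A block written as prefix/length; the mask follows from the length.
block = ipaddress.ip_network("203.0.113.0/24")
print(block.netmask)                 # mask with the top 24 bits set
print(block.network_address)         # first address in the block
print(block.broadcast_address)       # last address in the block
print(block.num_addresses)           # how many addresses the block covers

# A toy routing table: three entries stand in for every covered address.
routes = {
    ipaddress.ip_network("0.0.0.0/0"): "default link",
    ipaddress.ip_network("203.0.113.0/24"): "link A",
    ipaddress.ip_network("203.0.113.128/25"): "link B",
}


def next_hop(destination):
    """Longest-prefix match: the most specific block containing the address wins."""
    addr = ipaddress.ip_address(destination)
    matching = [net for net in routes if addr in net]
    best = max(matching, key=lambda net: net.prefixlen)
    return routes[best]


print(next_hop("203.0.113.5"))
print(next_hop("203.0.113.200"))
print(next_hop("192.0.2.9"))
```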

How much is it being organized? Currently there is too much granularity in the global Internet routing table. For a router it takes the same amount of memory to store a subnetwork with 256 IP addresses as it does to store a subnetwork with 65,536 addresses. So if our main concern is to minimize memory usage in Internet routers, thereby lowering operator costs and increasing stability, we want as little granularity as possible in the Internet routing table. The problem is that many organizations use non-contiguous IP subnetworks that cannot be aggregated into larger subnetworks. This results in routers having to store many small subnetworks instead of fewer larger subnetworks, which will eventually lead to memory exhaustion in older routers and possible reachability issues. Currently the full Internet routing table is approaching 500,000 routes. Most network engineers expect problems once the routing table grows past 512,000 entries, since router memory limitations are always at bit boundaries.
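The effect of contiguity on table size can be demonstrated with `ipaddress.collapse_addresses` from Python’s standard library, which merges adjacent blocks where possible. The prefixes below are drawn from documentation ranges and the `aggregate` helper is a hypothetical convenience wrapper; the example only illustrates aggregation and does not model a real routing table.

```python
import ipaddress


def aggregate(prefixes):
    """Merge prefixes into the fewest blocks that cover exactly the same addresses."""
    networks = [ipaddress.ip_network(prefix) for prefix in prefixes]
    return [str(net) for net in ipaddress.collapse_addresses(networks)]


# Four adjacent /26 blocks: together they form exactly one /24.
contiguous = [
    "198.51.100.0/26",
    "198.51.100.64/26",
    "198.51.100.128/26",
    "198.51.100.192/26",
]

# Four /26 blocks of the same size that are not adjacent to one another.
scattered = [
    "192.0.2.0/26",
    "198.51.100.64/26",
    "203.0.113.0/26",
    "203.0.113.128/26",
]

print(len(aggregate(contiguous)), aggregate(contiguous))
print(len(aggregate(scattered)), aggregate(scattered))
```

Both lists cover the same number of addresses, but the contiguous one needs a single route while the scattered one still needs four. On the point about bit boundaries: if a boundary is read as a power of two, the nearest one to the figure quoted above is 2**19 = 524,288, which is often abbreviated as 512K.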

When is it being organized? IP addresses are organized once someone configures one on a device or sets up a Dynamic Host Configuration Protocol (DHCP) server. If an organization exhausts its supply of free IP addresses, it will have to make a request to the upstream provider or Regional Internet Registry (RIR) for more address space. In the early days of the Internet, large blocks of IP addresses were given to organizations, but this led to many of the addresses in these blocks not being used. We are now reaching a point where we no longer have new addresses to assign to organizations.

Markets are now emerging for organizations to buy and sell IP addresses, and the organizations who have held on to large amounts of unused addressing space stand to make significant revenue from selling their unused space. When these organizations sell their unused IP address space, they will break up large allocations into smaller subnetworks, thereby increasing granularity and further accelerating the growth of the Internet routing table.

How or by whom is it being organized? The Internet Corporation for Assigned Names and Numbers (ICANN) is currently responsible for initial allocation of IP addresses. It allocates /8 blocks of IP addresses to RIRs, who are then responsible for distributing allocations to organizations that request them. These organizations can then allocate IP addresses to smaller organizations, thus forming a loose hierarchy of organizations, where each level lower in the hierarchy receives a subset of the IP address space from the organization above it. ICANN no longer has any /8 blocks of IP addresses left to allocate to RIRs. Once all of the RIRs have exhausted their last allocations from ICANN, organizations will have to rely on secondary markets to increase their IP address space.

Other Considerations. The world of IP addressing will soon get a lot more interesting. The introduction of IPv6 as a replacement for IPv4 has been slow in coming and, while gathering momentum, continues at a snail’s pace. As organizations start purchasing IP addresses from one another, we should expect increased granularity and decreased stability in the Internet routing infrastructure. Whether or not normal Internet users notice will ultimately be determined by how quickly and effectively equipment vendors and engineers address the coming problems.

The Art Genome Project

By David Eicke, December 2014.

What is being organized? Artsy.net carries the ambitious mission of making “all the world’s art” accessible to anyone with an Internet connection. This is not only challenging purely from a scale perspective, with the number of artworks in the world daunting even if it were not being incremented constantly, but it is also challenging in that “art” is a nebulous term. Creators of music and literature often refer to themselves and each other as “artists.” The same goes for dancers and other performers. Will their works be included? The current collection seems to be mostly visual art, with some architecture and design objects included.

Artsy’s mission is to be carried out by their Art Genome Project, which is the organizational engine that powers their search and interactions. The name was inspired by Pandora’s project, as was their term for their organizing process: “genoming.” Genoming is not yet automated and still costly, so Artsy selects the art that is to be “genomed” carefully. Their first priority is the works featured in galleries with whom Artsy has contracts. Galleries pay to have their work organized and searchable on the site. Those works, then, must be genomed quickly in order to keep the company running. Artsy’s engine also takes in works from museums and other institutions who do not have contracts with them, but many of those institutions have image-rights concerns, and not all their artworks can be published. In other cases, the images of the works are simply too low-quality to be displayed.

Why is it being organized? Why organize art? The simplest answer is to educate. That said, art has been organized into movements and -isms for a very long time. The Getty Foundation even created an authoritative art vocabulary called the Categories for the Description of Works of Art a few decades ago. At first glance, Artsy seems to be reinventing the wheel. However, the organizing system Artsy uses is unique in that it facilitates a special kind of interaction with its body of published works.

The way resources are organized on Artsy is a cross between a hierarchical structure and a graph structure. They have over 1,000 characteristics (which they call “genes”) to describe their resources. These characteristics can have to do with art movements, formal qualities, techniques, subject, etc. The emphasis here, however, is on relationships between works of art. For example, one of the genes Artsy uses is “eye-contact,” and if you have a photo taken last month where the subject is looking directly into the camera and an oil painting from hundreds of years ago where the subject’s eyes are looking at the painter, those two can be one click away from each other. No other organizing system could facilitate that sort of easy link between two such disparate works.

This free-flowing linkage between works enables the “berry-picking” model of knowledge seeking, where a user searching for something doesn’t necessarily have to know what he or she is searching for. A user could begin her exploration with only a vague notion that she enjoys this long-legged rhinoceros sculpture by Salvador Dali. She may not know what she likes about it, but she will see his other work there. Maybe she finds a painting she likes in the “other works by Dali” section, and she clicks on it. Then the characteristics of this painting are listed in the interface, and she is free to click on any one of them. She might click on “Surrealism” and find more works from that movement. She may click on “waterscapes” and find other oceanic imagery. She is free to explore and discover art in a self-directed way and free to discover what she likes and why she likes it. The director of Artsy’s Art Genome Project says the system was intended to parallel a professor who is adept at “riffing” on things.

How much is it being organized? As mentioned above, Artsy currently uses over 1,000 characteristics (“genes”) to describe its resources. These characteristics can describe anything from the art’s form to the art’s subject to the technique used to create the art. Experts assign these genes to the artworks and then assign those genes a weight from 0 to 100, depending on the salience of the characteristic within the work. Aside from the genes, the art is described in terms of physical dimensions (how much space it takes up), whether it has been sold or not, its gallery, its price (if for sale), its creation date, and, of course, who created it. Having such a rich set of descriptions has allowed Artsy to create a public API for developers to use all of this information as they see fit.
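A minimal sketch of how weighted genes could link otherwise unrelated works follows. The Artwork class, the works_sharing_gene helper, the sample works, and their weights are all hypothetical and invented for illustration; this is not Artsy's data model or its public API.

```python
from dataclasses import dataclass, field

@dataclass
class Artwork:
    title: str
    genes: dict = field(default_factory=dict)  # gene name -> weight from 0 to 100

def works_sharing_gene(collection, gene, min_weight=1):
    """Return titles of works carrying `gene`, strongest weight first."""
    matches = [w for w in collection if w.genes.get(gene, 0) >= min_weight]
    matches.sort(key=lambda w: w.genes[gene], reverse=True)
    return [w.title for w in matches]

# Hypothetical works and weights, invented purely for illustration.
collection = [
    Artwork("Recent portrait photograph", {"eye-contact": 90, "photography": 100}),
    Artwork("Old oil portrait", {"eye-contact": 70, "oil painting": 100}),
    Artwork("Seascape", {"waterscapes": 95}),
]

linked = works_sharing_gene(collection, "eye-contact")
```

Under these assumptions the photograph and the oil painting end up in the same result list because both carry the “eye-contact” gene, which is the kind of one-click link the case study describes.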

When is it being organized? Description of Artsy’s resources is an ongoing process. Their ingested collection of art is much larger than their published collection. Most of the artworks are waiting to be genomed, with some of them waiting for permissions or image-rights paperwork to process. Another factor in determining when something is organized is the signing of new contracts with galleries. Works from galleries with contracts have first priority, and Artsy experts genome those works as they come in.

While these experts are assigning genes on a rolling basis, they are also drawing upon hundreds of years of art history scholarship when assigning them. For example, the Artsy experts did not come up with Dadaism as an organizational concept. So, in a way, some of these works were organized long ago.

How or by whom is it being organized? Artsy has a team of art historians and experts working to describe the resources that Artsy has ingested (and those that it will ingest). They have done some experiments with image-recognition software, but its descriptions are simply not rich enough to facilitate the sorts of interactions the organization is trying to facilitate. The strategy of employing experts has its obvious downsides, however. It does not scale well, and it is reminiscent of Yahoo’s early strategy of employing librarians to describe web content. There will also be inevitable biases in human resource description.

Other considerations. With such a grand ambition, one thing that may stand in Artsy’s way of becoming an authoritative organizing system in the art space is that they are for-profit. Even if they are able to avoid too much bias in the interest of revenue generation, the perception remains that they are less interested in classifying art for educational purposes and more interested in making money.

Making a Documentary Film

By Suhaib Syed, December 2013.

Overview. As part of a small crew, I was in pursuit of making a documentary film shedding light on the problems in the higher education system in India. We had traveled far and wide, capturing many thought-provoking stories, illuminating interviews, and shocking truths. Due to the relatively small crew and a tight schedule, we ended up with our raw footage being labeled in a generic format (MVI_1234 etc.). I, being the director, had the task of assisting the editor in renaming and reorganizing the files to make our lives easier, do justice to all the efforts that were put into capturing all the clips, and incorporate them in an impactful manner.

What is being organized? The primary resources being organized were the video clips (digital, shot on DSLRs) acquired during the shoot. In this context, they could be classified as passive resources having no real capability to produce any significant value on their own, and which had to be acted upon or interacted with to produce any effect. But the key problem here was to formulate usable resource descriptions based on the following resource properties:

Intrinsic static

Date and time of creation, duration of the clip, type of external lighting used, camera used, lens used, exposure, ISO, white balance, frame rate, compression type

Extrinsic static

Shot sequence number (assigned to each story element during story-boarding), shot movement type (dolly, follow focus, zoom, macro, etc.)

During this particular stage, the intrinsic and extrinsic dynamic properties did not play a large role in the resource descriptions.

We had done a lot of work on story-boarding and identified the right level of granularity so that we could capture each shot sequence separately, so we directly used the shot sequence number as an important part of the resource description. This helped us keep our descriptions short and meaningful.

Additionally, we realized that the corresponding audio clips captured along with the video also had to be organized, but since the two were intricately linked to each other we decided to use the same name as the corresponding video clip, the only difference being the extension. We relied on the editing software to capture the intrinsic static properties of the audio files (e.g., bit rate and compression type).

Why is it being organized? Essentially, we were organizing these digital resources to find, identify, and select them so as to weave a powerful narrative enabling us to convey the truth in an impactful manner.

Hence, the interactions were directly with the primary resource.

The interactions that had to be supported by our organization scheme involved:

Finding the clips related to a particular story-board section

Selecting the best set of clips to be included in the film based on relevance to story, progression, continuation and several other inter-connected factors

Manipulating the clip (i.e., color-correcting, white balancing, and stabilizing) to create an aesthetic effect

Matching the video of a clip to corresponding audio recording

Adding the right background score based on sentiment being portrayed in the clips and the progression of the story

Providing subtitles in case of a foreign language or incoherent speech

How much is it being organized? Since the scope and size of our organizing system was relatively limited and all the resources were already available, we were able to make some bold decisions without causing a lot of problems. We formed a controlled, vertical vocabulary for resource description by deliberately choosing certain resource properties over others. Our main objective was to keep the description as short as possible and at the same time convey the most valuable information that would help us interact with the resources (i.e., the video clips).

We could easily have opted for an identifier based on the date and time stamp, which would have given every resource in a collection (i.e., the clips specific to one camera) a unique identifier. However, we realized that our cameras already attached this information to the file, along with technical details like frame rate, aperture, shutter speed, ISO, and white balance, which our operating system and editing software could easily capture, display, and search through. Hence, we decided not to use these details.

We also decided not to include important lighting condition properties (kino-flo, LeikoLite, etc.) and location, because the first frame in most of our clips consisted of the clap-board, which contained all of this information, and our editing software showed all the video files as thumbnails using the first frame of the video.

Thus we leveraged all of these to form a controlled vocabulary that placed the shot sequence number first, followed by the take number, followed by the camera identifier (e.g., camA, camB, etc.). For instance: 2A_1_camB.
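The naming scheme can be sketched as a small build-and-parse pair. The exact pattern is an assumption inferred from the single example 2A_1_camB (digits plus optional letters for the shot sequence, a numeric take, and a camera identifier of the form camX); the names ClipName, build_clip_name, and parse_clip_name are hypothetical.

```python
import re
from typing import NamedTuple

class ClipName(NamedTuple):
    shot_sequence: str  # story-board shot sequence number, e.g. "2A"
    take: int           # take number
    camera: str         # camera identifier, e.g. "camB"

# Assumed pattern: <shot sequence>_<take>_<camera>, as in "2A_1_camB".
CLIP_PATTERN = re.compile(
    r"^(?P<shot>[0-9]+[A-Z]*)_(?P<take>[0-9]+)_(?P<camera>cam[A-Z])$"
)

def build_clip_name(shot_sequence, take, camera):
    """Assemble a file name stem from its three parts."""
    return f"{shot_sequence}_{take}_{camera}"

def parse_clip_name(stem):
    """Parse a file name stem (no extension); raise ValueError if malformed."""
    match = CLIP_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"not a valid clip name: {stem!r}")
    return ClipName(match["shot"], int(match["take"]), match["camera"])

parsed = parse_clip_name("2A_1_camB")
```

Because the stem carries no extension, the same stem can name both a video clip and its matching audio file, and a generic camera name such as MVI_1234 is rejected as not yet organized.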

However, we did realize that these decisions were specific to our OS and video editing software and hence lacked interoperability.

When is it being organized? In our case, although we intended to organize the resources as soon as they were acquired, we failed and then came up with an organizing system after all the resources were acquired. We leveraged this fact to our benefit and formed a more specific description system.

How or by whom is it being organized? Ideally it is the role of the first assistant cinematographer (AC), even 2nd or 3rd AC (depending on the budget), to make sure all the file names are stored properly and all the cards properly backed up. But due to our limitations we (i.e., the director and cinematographer) collaborated to organize the set of raw footage.

Other considerations. One important consideration that we left out in the discussion was the need for certain people appearing in the documentary to have their identity hidden by means of facial blurring and voice modulation. Although we could not accommodate this interaction of identifying which clips had footage of people who did not want to reveal themselves, we could easily add the special effects over an entire sequence once all the clips were brought together.

The Dabbawalas of Mumbai

Indian Lunch Box System

By Pratibha Rathore, December 2014.

Overview. The Mumbai dabbawala tiffin service is the source of much fascination from around the world, and I am no different: I worked in Mumbai for two years and used the services of dabbawalas to get my lunch box (called a “dabba”) delivered from home to my office, which was about 44 miles away. Without the use of any technology or digital resources, this organizing system has been coordinating the delivery of home-cooked lunches to thousands of Indian office workers for over a century, charging just a small fee of $3-7 per month. The community of dabbawalas has been able to create value for its customers by optimizing and standardizing the principles of its operations and devising an organizing system that is down to earth and human-centric.

What is being organized? The primary resources in the dabbawala system are the dabbas that are delivered to the respective customers’ offices and organized using a simple but effective color-coding system. The secondary resource is the workforce, consisting of 5,000–6,000 people known as dabbawalas, who organize themselves and their supporting supply chain and logistics operations to deliver the dabbas to the right location and at the right time each day without failure. The dabbawala community, called the Mumbai Tiffin Box Suppliers Association (MTBSA), follows a flat organization structure, meaning the motivation to perform consistently is a matter of personal drive and accountability.

Why is it being organized? The primary reason people use the service of the dabbawalas is to eat a proper, home-prepared meal during lunch, a way to connect with their family while busy at work. The interactions supported by the dabbawala organizing system provide two significant benefits to the customers: managing their budgets while eating healthy, and coping with time constraints. Most of the office-goers usually leave by 7 a.m. to commute from the suburbs of Mumbai, traveling south to the main commercial area of Mumbai and returning home after 7 p.m. The railway network during the peak hours is jam-packed with commuters hanging onto the trains with one hand; therefore, carrying one’s lunch at that time is not feasible. Most of the commuters cannot afford to eat takeout every day, and eating on the roadside is unhealthy and unhygienic. In addition, catering to the diverse food habits and taste needs of employees is very difficult for office canteens to manage. Thankfully, the dabbawala system solves all these problems with 100 percent customer satisfaction by delivering to each employee his lunch filled with food prepared at his home.

How much is it being organized? The Mumbai lunch box system is a successful and socially sustainable enterprise. The number of dabbas delivered per day to offices and back home is around 300,000; that means 600,000 transactions per day. Although the number of transactions is very large, each person handles a small subset of transactions at a time. The scope of the organizing system and the scale of operations remain fairly consistent, with the addition or deletion of a few dabbas every month. Most interestingly, despite the lack of computers, mobile technology, or any automated processes, a dabba goes astray only once every two months, making less than one mistake in every 6 million deliveries. Now that’s efficiency! The system is able to achieve consistency in its operations because of successful implementation of several organizing principles. First, containers used to house the lunch boxes are of a standard shape and size. Second, the color coding done on the dabbas takes advantage of people’s visual acuity, following a human-centric design approach. Third, the sequence of transactions to deliver each dabba from its source to destination and back to source is repeatable, predictable, systematic, and iterative in nature, enabling easy tracking and monitoring. Finally, governance within the community is achieved by instilling ethics, values, and principles in employees and by holding employees accountable at all times.

When is it being organized? The interactions between dabbawalas to deliver the dabbas follow a “hub and spoke” process model. During a dabba’s journey from kitchen to consumer, it is handled by between three and twelve different deliverymen. The typical day for a dabbawala begins at 9:30 a.m., and he spends about an hour collecting all the 25–30 dabbas from the assigned set of homes in his designated area. The households are expected to have the lunch box ready when he arrives for collection. When he is done with collection, he goes to the local train station and gathers with the other dabbawalas of his area. Next, the dabbas are sorted in the order of stops on that rail line and handed off to the dabbawala who is responsible for that particular station for delivery to their final destination. At every departure station, the dabbas are passed out according to their next destinations. The same process is repeated when returning empty dabbas back to homes.

Figure 12.3. Dabbawala Delivery Process


A model of the dabbawala delivery process


How or by whom is it being organized? The key to this successful delivery management system is the color coding done on the dabbas. The dabbawalas use simple design measures such as signs, different colors, numbers, dashes, dots, letters, and simple symbols to indicate various parameters such as origination suburb, route to take, destination station, who is responsible, the street, building, floor, etc. As most of the dabbawalas are illiterate, the syntax for markings is chosen so that it is easy to understand and implement. The vocabulary used to implement and describe markings on the dabbas follows a standardized and self-descriptive process, thereby eliminating ambiguity and variability and making the organizing system more effective. Since only numbers and letters are used, the syntax for description of the primary resource (dabbas) is intentionally made to be independent of any local language, so that everyone can learn, understand, and process without any confusion, bias, or information overload.

Figure 12.4. Dabba Routing Codes


A breakdown of the coding system used to identify and route a dabba.


At each stage of the process, only one part of this code needs to be read, which works as a signal and thus allows picking up the right dabbas very quickly. It is also particularly efficient for traceability, since any dabbawala seeing a dabba knows which path it has to take. In case a dabba is lost or forgotten somewhere, any dabbawala is able to put it back on the right track. There is no need for the structure of color coding to be more granular than described above, as dabbawalas know the collection areas by heart. Furthermore, the process of adding a new resource to the organizing system is straightforward and structured. If a new resource (that is, a new customer) is added to the system, the dabbawala will do the complete journey to check the address of delivery and coordinate with other colleagues in the community to see who has a free place in his crate to add one more dabba. Once the sequence of delivery has been established and all the necessary stops for exchange decided, the address on the dabba is marked and it becomes part of the whole system.
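The idea that each stage reads only one part of the code can be sketched as a grouping step keyed on a single field. The DabbaCode fields are taken from the parameters named in the text (origination suburb, destination station, building, floor), but their encoding, the sort_for_stage helper, and the sample markings are hypothetical; the real codes are painted colors, symbols, and abbreviations.

```python
from collections import defaultdict
from typing import NamedTuple

class DabbaCode(NamedTuple):
    origin_suburb: str        # where the dabba was collected
    destination_station: str  # rail station where it leaves the train
    building: str             # delivery building
    floor: int                # floor within the building

def sort_for_stage(dabbas, stage_field):
    """Group dabbas by the single code field that matters at this stage."""
    groups = defaultdict(list)
    for dabba in dabbas:
        groups[getattr(dabba, stage_field)].append(dabba)
    return dict(groups)

# Hypothetical markings, invented for illustration.
crate = [
    DabbaCode("S1", "3", "E", 12),
    DabbaCode("S1", "9", "A", 4),
    DabbaCode("S2", "3", "E", 7),
]

# On the train only the destination station is read; the rest is ignored.
by_station = sort_for_stage(crate, "destination_station")
```

At the destination station the same helper could be applied again with "building" as the field, which mirrors how a different dabbawala reads a different part of the same code at each hand-off.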

Other considerations. It would be interesting to know if this delivery model could be used by other cities, as the problem of long commutes and the need of office workers for homemade lunches are common to major cities. In my view, standardization of operations and an understanding of cultural and regional biases can provide opportunities for other cities to implement this model, at the same time providing jobs to many semi-skilled workers.

Managing Information About Data Center Resources

By Hassan Jannah, December 2013.

Overview. Nowadays, there is an app for almost everything! Yet, we show little or no regard for what happens behind our shiny little screens until something breaks down and our lives descend into near chaos. That is the conundrum of IT guys. The truth is that IT solutions are, in many cases, fragile things that need constant care. This is no easy task. In fact, most of the cost and effort involved in IT solutions is maintenance. A million things could go wrong. Preventive maintenance, service monitoring, business continuity, and disaster recovery are examples of the different activities done to maximize availability and expedite troubleshooting. Everyone involved with these activities needs access to resources. Above all, they all need access to information.

What is being organized? IT data centers have both physical and digital resources. Physical resources include the facility (i.e., building), utilities, computer hardware (e.g., network switches, cables, servers, storage, etc.), and also people. Digital resources are much fuzzier to define. A simplistic approach could classify them into data and applications. Each category can be further sub-classified into an entire ontology. The complexity increases when you consider the great number of potential resource types that can be created by combining physical and digital resources. Capturing, storing, and maintaining information about these resources is a big challenge. A lot of information can be retrieved from the resources themselves. Usually, each team responsible for supporting a certain group of resources would store information in spreadsheets and documents. More organized teams would use databases or knowledge management systems. More diligent organizations would have a central repository for everything.

What many fail to capture is the information about how all of these different clusters of resources are interconnected. That is often a much bigger and more complex challenge. That information could be either buried deep in these systems (e.g., the user name used to run a certain service) or stored in people’s brains. The added value of an organizing system for data about data center resources can be multiplied if it also effectively organizes information about their interactions.
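One way to make interconnection information explicit is to record it as a dependency graph and query it. The following is a minimal sketch under stated assumptions: the resource names, the depends_on mapping, and the affected_by helper are all hypothetical and do not describe any particular data center tool.

```python
from collections import deque

# Hypothetical inventory: each resource maps to the resources it depends on.
depends_on = {
    "billing-app": ["vm-01", "orders-db"],
    "orders-db": ["vm-02"],
    "vm-01": ["host-a"],
    "vm-02": ["host-a"],
    "host-a": ["rack-switch-1"],
    "rack-switch-1": [],
}

def affected_by(failed, depends_on):
    """Return every resource that directly or indirectly depends on `failed`."""
    # Invert the edges so we can walk from a resource to its dependents.
    dependents = {name: [] for name in depends_on}
    for name, requirements in depends_on.items():
        for requirement in requirements:
            dependents.setdefault(requirement, []).append(name)
    # Breadth-first walk over dependents, visiting each resource once.
    affected, queue = set(), deque([failed])
    while queue:
        for dependent in dependents.get(queue.popleft(), []):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return affected

impact = affected_by("host-a", depends_on)
```

With the relationships written down, the question "what breaks if this host fails?" becomes a lookup instead of something only a few people can answer from memory.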

Why is it being organized? Running an IT data center is complex, resource intensive, and risky. Customers require around-the-clock availability of services with no room for failure. The consequences of such failures go beyond financial loss and customer dissatisfaction. They could affect people’s safety and even national security. Cyberattacks have become a constant threat for IT service providers, especially those that host highly sensitive data or serve critical operations. People can survive if their emails are inaccessible for an hour. However, what are the ramifications of a total failure of the IT infrastructure of the New York Stock Exchange? What if the airport systems of Heathrow airport failed? These are some of the conditions that IT data center managers must work in. Furthermore, technology advances have created highly diverse, complex, and integrated solutions. New resources are introduced frequently as old resources are retired. These activities require careful planning and execution to prevent the intricate eco-system from crashing. Having all the information required to plan these activities would mitigate that risk.

Nevertheless, when something does go wrong, having the required information is equally important to expedite fixing it. In fact, the need for readily available information increases with the severity of the problem. How can you rebuild a system if you do not know how to connect its parts?

How much is it being organized? The granularity of the data required about data center resources varies between organizations and also between stakeholders of the same organization. The information can be classified into operational and planning information. Operational information is required for running day-to-day operations. It includes information about resources and how they are interconnected. Many organizations put most of their focus on organizing operational information with high granularity. The granularity could be influenced by economic, political, and intellectual factors. Higher granularity means that more time and money are required to organize the information.

The level of granularity used to describe a resource type can be driven by the motives of the team leading the activity. For example, a hardware systems support team would invest more in building a robust organizing system for hardware systems and not focus on applications running on that hardware. Finally, the team’s intellectual abilities and knowledge would influence the granularity of the system. As the boundaries between physical and digital resources fade, system designers could face some challenging questions. For example, servers are traditionally considered hardware resources. However, many organizations have switched to virtual servers running on big machines. In such a case, how would you define a server? Is it the big machine or the individual virtual servers? Is it a physical resource or a digital resource? If you have a standby clone of a virtual server, would you consider both to be the same entity or not?

Planning information is usually required to make business decisions and is usually less granular. This could include information about purchase and maintenance costs, contracts, hardware lifetimes, etc. Managers and planners could use this information to better plan for business activities, manage operational and capital costs, and make strategic decisions about the services and products the data center offers.

When is it being organized? Many data centers start building an organizing system of data about their resources based on existing resources. In such cases, building the system is the easy part. The real challenge is keeping the information up to date in an ever-changing environment. Clear information life-cycle and change management processes are required in parallel with work processes to ensure information is updated.

How or by whom is it being organized? Based on the scope and level of granularity of the system, the number of resources could potentially be gargantuan. The organization must try to maximize the amount of information collected automatically, using auto-discovery “agents” to keep the information up to date. Inevitably, other information, especially information describing interdependencies, will require human entry. The organization must have a clear and comprehensive governance framework that details the roles and responsibilities of different parties in adding and maintaining information.

Other considerations. Most big companies in the past operated their own corporate data centers. Their organizing system might have a smaller scope. The emergence of global cloud service providers has extended the commoditization of IT products and services across the entire technology landscape; from the consumers all the way back to the servers that provide them. These providers will have a bigger scope due to the diversity and dynamic provisioning of their services.

Neuroscience Lab

By Colin Gerber, December 2013.

Overview. A neuroscience lab is doing Parkinson’s disease research in which they do experiments with rats. They use different types of rats, surgeries, and drugs for experiments and have to keep track of all this information for data analysis, publications, and lab inspectors.

The existing organizing system was developed before personal computers were prevalent and has slowly evolved over time. However, much of the underlying structure of the system still has its roots in pre-computer concepts. In order to update the system to incorporate more modern technologies, what changes need to be made to the resources, their descriptions, and the system’s structure?

What is being organized? Resources in the current organizing system include rats, surgeries, experiments, drugs, and data recorded from the experiments. There are some other resources that could be incorporated into the organizing system.

Neuroscience Research Equipment


Physical resources in the author’s lab are arranged to facilitate the precise accuracy of interactions required in medical research. In this photo, an array of amplifiers and filters for processing and recording rats’ brainwave signals (left) is installed in a vertical rack that can be located close to the equipment used to perform surgeries.

(Photo by Colin Gerber. Used with permission.)

One such new resource is surgery techniques. Surgery techniques have historically been passed down by the master-apprentice method, and the information was largely tacit knowledge held by the researchers performing the surgeries rather than stored explicitly in the system. This was done because it is inherently difficult to capture the intricacies of surgery in text and even more difficult for a new researcher to learn how to perform the surgery from textual information. The ability to store and annotate multimedia changes this, however. It is now possible to make instructional videos for each type of surgery, add resource descriptions to the video file, and store it in the organizing system.

There is also a resource that is treated as one resource through its entire lifetime when it may actually be two. When rats are originally brought into the organizing system they are treated as manifestations of the rat resource type, meaning that the rats are interchangeable: any rat from that group can be used in a surgery. Once the surgery has been performed, the rat becomes a new resource instance. The specific rat the surgery was performed on now has a new set of resource descriptions.

Why is it being organized? Is the main purpose of the organizing system to make sure the correct rats are used in each experiment? Or is it to make sure the records are kept up to date for the lab inspectors? It could also be to make data analysis and paper writing more efficient. These decisions will affect how many different types of resource descriptions are required and the granularity needed for those descriptions.

This system is just one of many organizing systems within a lab, so deciding its scope and its interactions with the other organizing systems is very important. One important decision is whether the system will support the training of new members of the lab. Having resources such as video recordings of surgeries and experiments could enable teaching interactions for new researchers. But a new researcher must go through many other aspects of training. Should these also be included in the organizing system? If so, it would make the system much more complex and expand its scope beyond surgeries and experiments, but it would keep all of the teaching resources in one system.

Another option would be to have a separate organizing system that is responsible for training material and is able to interact with the multimedia in this system that is relevant to training. This would not expand the scope of the system but would make maintenance more difficult: each time a surgery technique or experiment changed, two systems would have to be updated to take the changes into account.

How much is it being organized? The system is accessed by many types of users, each requiring a different type of interaction. The researchers need to search for the correct rat and surgery technique. The lab inspector needs to check for drug logs and make sure all the surgery methods and equipment are up to date. The principal investigator needs to see an overview of progress on projects.

Currently the system is organized in hierarchical categories where the top-level categories are surgery and experiments. This organization makes it easy to retrieve specific resources. However, the interactions normally performed with the system use resources from both sub-trees, which makes the hierarchical approach less than optimal.

A faceted classification approach could work well to enable these interactions. The facets would incorporate the original categories of surgery and experiments but also add facets for each common type of interaction. In this case different resource descriptions of the same resource will often be classified into different facets. These resource descriptions will often act as resources themselves. For example, a lab inspector is interested in retrieving the expiration date and times a drug was used in surgery, not the drug itself.
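The idea of classifying resource descriptions into facets, and retrieving them as resources in their own right, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Description` class, the `select` function, the facet names `activity` and `interaction`, and the sample records are hypothetical and are not taken from the lab's actual system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Description:
    """One resource description, retrievable as a resource in its own right."""
    resource: str    # what is described, e.g. a drug or a rat
    name: str        # which property is described
    value: object    # the recorded value
    facets: dict = field(default_factory=dict)


def select(descriptions, **wanted):
    """Return the descriptions that match every requested facet value."""
    return [d for d in descriptions
            if all(d.facets.get(k) == v for k, v in wanted.items())]


# Hypothetical records; all names and values are invented for illustration.
records = [
    Description("drug-A", "expiration_date", date(2015, 6, 30),
                {"activity": "surgery", "interaction": "inspection"}),
    Description("drug-A", "used_in_surgery_at", "2014-11-03T09:15",
                {"activity": "surgery", "interaction": "inspection"}),
    Description("rat-17", "weight_grams", 412,
                {"activity": "experiment", "interaction": "analysis"}),
]

# The inspector retrieves descriptions of the drug, not the drug itself.
inspector_view = select(records, interaction="inspection")
```

Because each description carries several facet values, the same record can be reached through the original surgery and experiment categories or through an interaction facet, without being tied to one branch of a hierarchy.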

When is it being organized? In a neuroscience lab, resource descriptions are often lost if they are not recorded at the time they are measured. For example, if a rat is weighed to calculate the correct dosage of a drug, both the dosage and the weight should be entered into the system. If the weight is not entered at the time of measurement, it would be impossible to weigh the rat later and get the same result (as the rat changes weight over time). This is a common problem, so as a rule all resources and descriptions should be entered into the system at the time they are acquired.
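As a rough sketch of this rule, the weight, the dosage derived from it, and the time of measurement can be captured in a single step so that none of them can be entered later from memory. The record layout, the function name `record_dosage`, and the dose rate in milligrams per kilogram are assumptions for illustration only; they do not describe the lab's real procedure or any real drug.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DosageRecord:
    """Weight and dosage stored together with the time of measurement."""
    rat_id: str
    weight_g: float
    dose_mg: float
    measured_at: datetime


def record_dosage(rat_id, weight_g, mg_per_kg, log):
    """Compute the dose from the weight and log both at measurement time."""
    dose_mg = weight_g / 1000 * mg_per_kg  # convert grams to kilograms first
    entry = DosageRecord(rat_id, weight_g, dose_mg, datetime.now(timezone.utc))
    log.append(entry)
    return entry


# Hypothetical rat and dose rate, invented for illustration.
log = []
entry = record_dosage("rat-17", weight_g=400.0, mg_per_kg=5.0, log=log)
```

Making the record immutable (`frozen=True`) reflects the point that the measured weight cannot be reconstructed afterward, so it should not be silently overwritten.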

How or by whom is it being organized? The researchers working in the lab do all of the organizing. They create the new resources and descriptions, and they have the most knowledge about the resources and how they relate to each other.

Other considerations. Changing the system and entering all of the data at the time of measurement will initially cause more work for the researchers but will result in more accuracy for the interactions supported by the system and less retrieval work during data analysis and paper writing.

A Nonprofit Book Publisher

By Emily Paul, December 2014.

Overview. The New Press, a nonprofit book publisher with approximately 1,000 published titles, roughly 800 of which are actively in print and featured on the website, updated its book categories for use on thenewpress.com as part of a website redesign. Rather than fully adhering to an established book classification system, such as BISAC, which is commonly used in book retail, The New Press developed its own classification system. In addition to the standard goal of allowing readers to browse categories, this classification system is designed to represent the press’s focus and mission. The New Press classification system employs a mix of principles and levels of granularity while incorporating some elements of the institutional categories from BISAC.

In order to gain some insight into how these dual goals affect usability, I ran user tests on a mockup of the website with the proposed categories. I conducted a think-aloud exercise in which the users verbalized their thoughts as they browsed through the categories and subcategories. I then asked the users to walk through where they would go for a particular book in response to a prompt from me that included the book’s title, subtitle, and a brief description. Lastly, I asked the users about what their impressions were of The New Press after looking at the categories, whether they were confused by the categories, and which categories they would be interested in looking at if they visited the site.

What is being organized? The resource being organized is the digital presence of the books on thenewpress.com. The classification system is only used on The New Press website and is stored in a FileMaker database that pushes data to the website. There is already a dedicated website classification system that this new system builds on. It is worth noting that the book records in the database also contain BISAC categories. These are entered so that they can be sent out to distribution and bookseller feeds that require the industry-standard categories. The BISAC categories are institutional categories created by the Book Industry Study Group. The BISAC system is designed to reflect the interests and understanding of general readers. As such, the BISAC categories are informed by cultural categories and also influence cultural categories because of their broad adoption in the book industry. In addition to using some institutional categories from BISAC and mainstream cultural categories, The New Press is using cultural categories from specific groups, namely academics and political progressives, to connect with specific readers.

Why is it being organized? The books are being categorized to facilitate browsing by readers and supporters on The New Press website. In addition to the primary browsing interaction, the categories are also being used as an opportunity to position The New Press and to convey a sense of its mission.

How much is it being organized? For the purposes of The New Press website, books can be placed in multiple categories and subcategories, but all books will have at least one category designation. Because The New Press is not concerned with the physical presentation of the resources, the books can be placed in as many categories as are relevant. In contrast, library and bookstore classifications need to satisfy the uniqueness principle, because the book can only be located in one physical location.

Most of the categories are based on the subject matter of the books. A book’s subject matter is an intrinsic static property because it does not change once it is published. However, the categories used to describe this subject matter may change over time as new categories are added to the classification system and retroactively assigned to previously published books. The book subject categories can generally be thought of as extrinsic and static because the threshold for changing them is higher than it is for more dynamic properties such as Current Season, Next Season, and Bestsellers. These dynamic categories are also included on the site, in a separate section, and are all extrinsic, dynamic properties because they are based either on time or sales, rather than intrinsic properties of the books.

The New Press classification system includes hierarchical categories, though only the subjects in which the press publishes more extensively have subcategories. In areas for which there are more books, the organization can be more granular without creating a subcategory that contains only one or a few books. Additionally, the greater institutional knowledge of the subject area enables the staff to make more specific distinctions within the broader subject category. One of the questions I explored in my user testing was whether these differentiations are necessary to support users’ interactions with the books. If the users do not share the same level of knowledge in the subject it may not be useful, and may even diminish usability, to differentiate at the level of granularity provided by the subcategories.

Even at the top category level, there is a range of granularity and also a range of principles embodied in the categories. For example, History and Immigration are both top-level categories, but Immigration covers a more specific group of topics than History does. Most categories are based on the subject of the books, but there are several top-level categories based on other principles. These include Graphic Nonfiction, which refers to format; Primary Source Documents, which refers to the source material; and Biography, which refers to the genre of the book but does not express anything about its subject matter beyond the fact that it is about someone’s life. Mixing category principles can be useful, particularly in a faceted system, which allows users to combine different categories to increase precision. In a faceted version of this system, a user could select Biography and Law in order to find biographies written about a judge or lawyer. Because books are assigned to all relevant categories in this system, this interaction is feasible at the logic level even though the current presentation does not allow it. If The New Press wanted to switch to a faceted presentation it would likely visually separate the categories into blocks based on the principles, so that users knew which facets they could pivot their searches on. This might include creating a genre section with Biography, Oral History, and Primary Source Documents as well as a geography section with the subcategories from World.
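The faceted interaction described above can be sketched in a few lines of Python. The book titles, the category assignments, and the mapping of categories to the facets "genre" and "subject" are hypothetical and invented for illustration; they are not The New Press's actual data or website logic.

```python
from collections import defaultdict

# Hypothetical titles and category assignments, invented for illustration.
books = {
    "Title A": {"Biography", "Law"},
    "Title B": {"Law", "Criminal Justice"},
    "Title C": {"Biography", "History"},
}

# Assumed mapping from each category to the principle (facet) it embodies.
facet_of = {
    "Biography": "genre",
    "Oral History": "genre",
    "Primary Source Documents": "genre",
    "Law": "subject",
    "History": "subject",
    "Criminal Justice": "subject",
}


def pivot(catalog, *selected):
    """Return the titles assigned to every selected category."""
    wanted = set(selected)
    return sorted(title for title, cats in catalog.items() if wanted <= cats)


def group_by_facet(categories):
    """Group categories into blocks by facet, as a faceted page might show them."""
    blocks = defaultdict(list)
    for category in sorted(categories):
        blocks[facet_of[category]].append(category)
    return dict(blocks)


matches = pivot(books, "Biography", "Law")  # biographies that are also about law
blocks = group_by_facet(facet_of)           # categories separated by principle
```

Because each book already carries its full set of categories, combining selections is a simple subset test. A faceted presentation would mainly need to add the grouping that tells users which categories can be combined.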

When is it being organized? Once the updated categories are finalized, all previously published books will be reviewed and assigned to new categories as necessary. Going forward, new books will be categorized on a seasonal basis and new categories may occasionally be assigned to previously published books on an ad hoc basis (this could be due to previous oversight in not assigning the category, or to the creation of a new category or subcategory). This system is flexible because books can be assigned to all relevant categories, so the introduction of a new category does not mean that all previous assignments will need to be changed. The subcategories also allow for flexibility because if one of these categories becomes more important over time, it can be changed at the presentation layer to a top-level category with minimal effort.

How or by whom is it being organized? The sales, marketing, and inventory manager assigns the categories, with input from the editorial and marketing teams. From time to time other departments, such as fundraising or publicity, may suggest a new category or category assignment for consideration. The categories are assigned in a FileMaker database in which the categories can be selected from a list of existing categories and subcategories. The category assignments in the FileMaker database are pushed to the website along with other book data.

Other considerations. Creating a classification system that can be widely understood is difficult to do. In this case, simplifying the system would support The New Press’s goal of reaching a broad audience of readers. User testing revealed that the current category system may be hindering this because of issues with semantics, granularity, and structure. The structural issues are the most important to address because the inconsistent use of subcategories generated significant confusion during the user testing. By removing the subcategories and instead allowing expert users or those who know exactly what they are looking for to use search, the press could maximize the categories’ relevance for general readers. This could be strengthened by an emphasis on using relevant keywords in the book descriptions that support searching. Despite some initial surprise from the test users about certain unusual top-level categories, I would argue that after simplifying other aspects of the system, the press could successfully keep some of these in order to represent its publishing areas and connect with like-minded readers. For example, Immigration and Criminal Justice are not top-level BISAC categories, but are easily understood by general readers and serve to highlight these important areas for The New Press. Biases in classification systems are unavoidable. While this can be negative, particularly when the organizers are not aware of the biases, it can also be harnessed positively and used to communicate a sense of the organization and its values. This needs to be approached thoughtfully and carefully and tested on users to understand how people outside the organization will interact with the system.


[646] [(Ctein 2010)] and [(Taylor 2010)] are popular guides for photo digitization and restoration.

[647] For example, http://web.appstorm.net/roundups/media-roundups/top-20-photo-storage-and-sharing-sites/ reviews 20 photo storage and sharing sites and http://photo-book-review.toptenreviews.com/ compares 10 sites for creating printed albums from digital photos in case you want to “round trip” from Grandpa’s photos and print photo books for family members.

[648] [(Herbst 2009)] is a thoughtful legal primer on the novel property, jurisdiction, and terms of service complexities in gaining access to accounts of deceased people. A popular treatment about what has come to be called the “digital afterlife” is [(Carroll and Romano 2011)].

[649] http://www.spo.berkeley.edu/guide/consultquick.html is an example of such a policy. Indeed, it is because of rules like these that the professor determined he needed to take a leave of absence from the university.

[650] For a high-level theoretical framework about capturing value from knowledge assets see [(Teece 1998)]; for a detailed case study see [(Goodwin et al. 2012)].

[651] [(Poole and Grudin 2010)].

[652] [(Hansen 2009)].

[653] [(Wakabayashi 2011)].

[654] [(Hori, Kawashima, and Yamazaki 2010)]. Fujitsu expects that the system will eventually integrate business management functions, production history, and operational support for best practices.

[655] See [(Burrell, Brooke, and Beckwith 2004)] for a study of the use of sensor networks in Oregon vineyards.

[656] [(Tagliabue 2012)]. We cannot resist describing this as “sexting” by cows.

[657] [(Wilde and Catin 2007)]. Looking back it seems ironic to start with a single-source XML publishing system, abandon it to author the book in Word, and then convert the Word files back to XML to enable single-source publishing.

[658] [(Kimber 2012)] seems destined to become the definitive resource for DITA-based publishing. The definitive source for DocBook has long been [(Walsh 2010)].


The Discipline of Organizing Copyright © by Robert J. Glushko. All Rights Reserved.
