From iPublic.org

Gov 2.0 is about bringing transparency and accountability to governance through timely access to useful public information. In the District of Columbia, Gov 2.0 started in March 2004 with the launch of the DCStat program. DCStat grew as an extension of the District's Geographic Information System (GIS) program and relies on core location-based services and content produced by the GIS unit.

From the start, DCStat's vision was to publish agency operations data and share that information both across District agencies and with the public. The public access goal was achieved with the launch of live data feeds on the District web site on June 12, 2006.


DCStat Case Study

Here, DCStat's founding program director shares knowledge gained and lessons learned with others seeking to stand up a similar capability. Guidance and supporting materials are provided on three subjects: 1) policy, 2) business process and 3) technology design.

Policy

In most government settings, public data are administered by stewards distributed across agencies. Appropriate access to this content ranges from wide open to tightly restricted, as with highly sensitive data protected from disclosure by law. Developing and communicating a clear, uniform policy for identifying and publishing public information is an important step.

In the District of Columbia, information that is traditionally public or obtainable via Freedom of Information Act (FOIA) requests is automatically deemed appropriate for sharing over the Internet. Further, some sensitive data sets are made appropriate for distribution by removing identifying references, either through obfuscation (e.g., reporting certain crime incidents only to the 100-block level rather than publishing a site address) or through aggregation (grouping individual occurrences into sets by geographic boundaries). When disagreements arose regarding protection of privacy or other matters, a leader from the Mayor's office would hear opposing views and render a decision.
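
The two de-identification techniques described above can be illustrated with a minimal sketch. This is not the District's actual implementation: the function names, the output wording ("BLOCK OF", "ADDRESS WITHHELD") and the `ward` field are assumptions chosen for illustration only.

```python
import re
from collections import Counter


def to_hundred_block(address: str) -> str:
    """Obfuscation: replace a leading house number with its 100-block.

    The output format is hypothetical, not an official District convention.
    """
    match = re.match(r"\s*(\d+)\s+(.*)", address)
    if not match:
        # No leading house number found: withhold rather than guess.
        return "ADDRESS WITHHELD"
    number = int(match.group(1))
    street = match.group(2).strip().upper()
    block = (number // 100) * 100  # round down to the nearest hundred
    return f"{block} BLOCK OF {street}"


def aggregate_by_area(incidents, area_key="ward"):
    """Aggregation: count individual incidents per geographic area.

    `area_key` names an assumed field holding each incident's boundary.
    """
    return dict(Counter(incident[area_key] for incident in incidents))


if __name__ == "__main__":
    print(to_hundred_block("1350 Pennsylvania Ave NW"))
    print(aggregate_by_area([{"ward": 1}, {"ward": 2}, {"ward": 1}]))
```

Obfuscation keeps one published record per incident but coarsens its location, while aggregation publishes only counts per boundary, so the choice depends on how much detail can safely be released.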

To coordinate the effort initially, the District Mayor's office issued a directive to all agencies regarding sharing public information. In addition to spelling out policy, the directive specified responsibilities of both the agency providing the source data and the DCStat team charged with publishing the content to live data feeds. The process was designed to give agencies at least 30 days' prior notice before publishing would start. This allowed time for the agency to prepare and review data sets before public release.

Business Process

The notion of providing information in a 'useful' form is important when developing a data sharing program. At the business level, steps may be taken to increase data set value to consumers. In Washington, DC, a global information model provides a common reference for organizing and presenting various data sets. An information sharing lifecycle was devised to identify, refine and combine data sets to produce consistent, meaningful information for distribution. The data sharing strategy can be summed up as follows:

  1. Provide access to many information sources
  2. Organize and cross-reference this content along shared 'dimensions'
  3. Improve government operations through rational, quantifiable interpretation

A systematic approach to publishing content speeds availability and keeps costs low. In Washington, DC, the DCStat team consistently prepared new data content sources for Internet access within two weeks, start to finish. The article "Methodology for publishing agency data" describes this process. A technology framework (described below) is also necessary to gain these efficiencies.

Technology Design

Extracting, transforming and loading data is a costly and tedious process if done manually. The key to a sustainable live data feeds program is implementing an automated data engine that:

  1. Collects source digital content: custom and commercial databases, spreadsheets, web pages, etc.
  2. Transforms and synthesizes source information into a shared information model (anchored to time, location and other common dimensions)
  3. Uses well-known machine-readable forms to publish continuously updated content
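
The three steps above can be sketched as a small collect, transform and publish pipeline. This is an illustrative sketch only, not the DCStat engine: the CSV column names (`reported`, `address`, `type`), the shared-model field names and the choice of JSON as the output form are all assumptions.

```python
import csv
import io
import json
from datetime import datetime


def collect(csv_text):
    """Step 1: read rows from a source (CSV text stands in for an agency export)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows, source_name):
    """Step 2: map source-specific columns onto assumed shared dimensions."""
    records = []
    for row in rows:
        records.append({
            "source": source_name,
            # Normalize the time dimension to ISO 8601 dates.
            "time": datetime.strptime(row["reported"], "%m/%d/%Y").date().isoformat(),
            # Normalize the location dimension to a consistent case.
            "location": row["address"].strip().upper(),
            "description": row["type"].strip(),
        })
    return records


def publish(records):
    """Step 3: serialize to a machine-readable form (JSON is an assumption here)."""
    return json.dumps(records, indent=2, sort_keys=True)


if __name__ == "__main__":
    sample = "reported,address,type\n06/12/2006,441 4th St NW,Pothole\n"
    print(publish(transform(collect(sample), "service_requests")))
```

Because every source is mapped onto the same time and location fields, records from different agencies can be combined or cross-referenced without source-specific handling downstream.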

The District data feeds technology uses a Service Oriented Architecture design and applies principles adopted from GIS, data warehouse, search system and other technology specialties. The DCStat Enterprise Service Bus houses the process logic and routes messages between the system's loosely coupled web services. The article "Data warehouse design for actionable government information" provides an overview of the District's database components and transformation processes.
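
The idea of a bus routing messages between loosely coupled services can be shown with a minimal in-process sketch. It is a simplified stand-in, not the DCStat Enterprise Service Bus: the `MessageBus` class, the topic name and the handler are hypothetical, and a real service bus would route messages between networked web services.

```python
from collections import defaultdict


class MessageBus:
    """Hypothetical in-process bus: services register by topic, not by each other."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def subscribe(self, topic, handler):
        """Register a service callback for messages on a topic."""
        self._handlers[topic].append(handler)

    def publish(self, topic, message):
        """Route a message to every subscriber and return their results."""
        return [handler(message) for handler in self._handlers.get(topic, [])]


def geocode_service(message):
    """Stand-in service that marks a record as geocoded (no real lookup)."""
    return {**message, "geocoded": True}


if __name__ == "__main__":
    bus = MessageBus()
    bus.subscribe("incident.new", geocode_service)
    print(bus.publish("incident.new", {"id": 1}))
```

The sender only names a topic, so services can be added or replaced without changing the code that publishes messages, which is the sense in which the services are loosely coupled.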

