Flow of Data Processing
1. Data Capture
The National Statistics Center confirms the quantity and type of survey forms and documents submitted by local governments and other related administrative bodies, sorts them, and stores them securely in a document storage facility.
To compile statistical data, questionnaires are read by OCR (optical character reader) or entered manually on PCs connected to an internal-only network.
2. Coding
Questionnaire sections that require free-text entry, including “Job Category,” “Kind of Work” and “Household Income and Expenditure,” are coded according to stipulated classification criteria (such as the Industrial Classification, Occupational Classification, and Income and Expenditure Classification).
The coding system assigns the content of the various entries to one of hundreds of categories. This work requires specialist knowledge of industries and occupational categories, as well as the ability to categorize an item correctly and quickly. In recent years the Center has advanced its research and development of an automatic coding system and is currently introducing and implementing it.
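As a rough illustration of what automatic coding involves, the sketch below matches free-text entries against a keyword dictionary. It is not the Center's system: the keywords, the codes ("EDU", "MED", "AGR") and the function name `auto_code` are all hypothetical, and a real classification would hold hundreds of categories.

```python
from typing import Optional

# Hypothetical keyword-to-code dictionary, invented for illustration only.
KEYWORD_CODES = {
    "teacher": "EDU",
    "nurse": "MED",
    "farmer": "AGR",
}

def auto_code(entry: str) -> Optional[str]:
    """Return a classification code for a free-text entry, or None if undecided."""
    text = entry.lower()
    # Collect every code whose keyword appears in the entry.
    matches = {code for keyword, code in KEYWORD_CODES.items() if keyword in text}
    # Assign a code only when exactly one category matches;
    # ambiguous or unmatched entries are left for a human coder.
    return matches.pop() if len(matches) == 1 else None
```

Entries that match no category, or more than one, return `None`, reflecting the point above that difficult items still call for specialist judgment.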
3. Data Editing
Working from the data captured by OCR and classified through coding, the National Statistics Center uses a computer system to check the entered data for missing values, validity and consistency. Gaps in individual records and inconsistencies between survey items degrade the quality of the statistics and undermine public trust in them.
The Center therefore extracts such data and corrects them based on statistical theory and related methods.
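The three kinds of check named above (missing values, validity, consistency) can be sketched as follows. The field names, the age range and the rule linking age to employment status are assumptions made for this example, not the Center's actual edit rules.

```python
from typing import List

def check_record(record: dict) -> List[str]:
    """Return a list of problems found in one survey record (empty if none)."""
    errors = []
    # Missing-value check: required fields must be present and not None.
    for field in ("age", "employment_status"):
        if record.get(field) is None:
            errors.append(f"missing:{field}")
    age = record.get("age")
    # Validity check: the value must fall in a plausible range (assumed 0-120).
    if age is not None and not (0 <= age <= 120):
        errors.append("invalid:age")
    # Consistency check between items (hypothetical rule):
    # a respondent under 15 should not be recorded as employed.
    if age is not None and age < 15 and record.get("employment_status") == "employed":
        errors.append("inconsistent:age/employment_status")
    return errors
```

Records that return a non-empty list would be the ones extracted for correction.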
4. Compiling Statistical Tables
Once data editing is complete, the “cleaned” data are compiled by computer and tabulated. Tabulation involves cross-tabulation using an estimation method designed for each statistical survey; compilation of time series data using seasonal adjustment; tabulation using multivariate analysis; and calculation of errors to estimate statistical accuracy. All of these draw on statistical theory and a variety of techniques.
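To make the idea of cross-tabulation with estimation concrete, here is a minimal sketch that sums a per-record weight within each cell of a two-way table. Using a single sampling weight is an assumption for illustration; the text does not specify the estimation method designed for each survey, and the field names are hypothetical.

```python
from collections import defaultdict

def weighted_crosstab(records, row_key, col_key, weight_key="weight"):
    """Sum record weights for each (row, column) combination."""
    table = defaultdict(float)
    for record in records:
        # Each record adds its weight to the cell it belongs to.
        table[(record[row_key], record[col_key])] += record[weight_key]
    return dict(table)

# Hypothetical sample: each respondent represents `weight` people.
sample = [
    {"sex": "F", "status": "employed", "weight": 100.0},
    {"sex": "F", "status": "employed", "weight": 50.0},
    {"sex": "M", "status": "unemployed", "weight": 80.0},
]
result = weighted_crosstab(sample, "sex", "status")
```

Each cell of `result` then holds an estimated population count rather than a raw count of respondents.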
5. Examination of Results
In addition to checking the figures and formatting of the results tables tabulated from the statistical data, the Center performs a variety of checks, including the logical consistency of statistical values; time series comparison with past values; comparison with related statistics; and verification of outliers. Together these form a comprehensive verification from several angles, aimed at finding inconsistencies in the data and ensuring the quality of the results tables.
Completed results tables are supplied via various media to the ministry or agency that conducts the survey, such as the Statistics Bureau, Ministry of Internal Affairs and Communications.
The time required for the entire tabulation process varies by survey. For example, the monthly tabulation of the Labour Force Survey (which yields figures such as the unemployment rate) takes approximately two weeks, while tabulation of the Consumer Price Index for Japan takes nearly one month.
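One simple way to picture the time series check is to compare a new figure with the spread of past values and flag it for review if it lies far outside that spread. The three-standard-deviation threshold and the function name `flag_unusual` are assumptions for this sketch, not the Center's documented procedure.

```python
from statistics import mean, stdev

def flag_unusual(history, new_value, threshold=3.0):
    """Return True if new_value deviates strongly from past values.

    `history` needs at least two past values so that a standard
    deviation can be computed.
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past values are identical: any different value is flagged.
        return new_value != mu
    # Flag values more than `threshold` standard deviations from the mean.
    return abs(new_value - mu) / sigma > threshold
```

A flagged value is not necessarily wrong; it only marks a figure that merits a closer look against related statistics.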