Open Cell-ID database
In my previous post, I showed how to use the Google Maps API to retrieve your approximate location from information about the currently connected network. In particular, the API requires the Mobile Country Code (MCC), Mobile Network Code (MNC), Location Area Code (LAC), and Cell-ID (CID). This works well, but it requires Internet access to reach the Google API, and it gives no direct access to the underlying cell-id database.
Opencellid – open source cell-id database
Luckily, there is an alternative for those interested in the full cell-id database: opencellid (http://www.opencellid.org/). The project aims to build a complete worldwide database of cell-ids and their associated location information. Users can run the opencellid clients on their phones to submit their location information to the server. The average geocode of all samples submitted for each cell-id is calculated and made available for download from the opencellid website. There are also APIs that let developers write their own clients to access the database.
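To make the averaging step concrete, here is a minimal sketch in Python. It assumes a plain arithmetic mean of latitude and longitude per cell, which may not be exactly what the opencellid server does; the function name `average_positions` and the sample coordinates are made up for illustration.

```python
from collections import defaultdict

def average_positions(samples):
    """Average the submitted (lat, lon) samples for each cell.

    `samples` is an iterable of (cell_key, lat, lon), where cell_key is
    (mcc, mnc, lac, cellid). Returns {cell_key: (lat, lon, nbsamples)}.
    Assumption: a plain arithmetic mean, which is only reasonable for
    samples that lie close together and away from the 180° meridian.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # running lat sum, lon sum, count
    for key, lat, lon in samples:
        acc = sums[key]
        acc[0] += lat
        acc[1] += lon
        acc[2] += 1
    return {k: (s[0] / s[2], s[1] / s[2], s[2]) for k, s in sums.items()}

# Invented sample submissions, not real measurements.
samples = [
    ((525, 1, 100, 12345), 1.0, 103.0),
    ((525, 1, 100, 12345), 2.0, 104.0),
    ((262, 2, 200, 54321), 52.5, 13.5),
]
averaged = average_positions(samples)
print(averaged[(525, 1, 100, 12345)])
```

The sample count is kept next to the averaged position because it is the only hint about how much to trust each entry.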
The data is in CSV text format and includes:
- mcc, mnc, lac, cellid: the cell-id information from the connected mobile network
- lon, lat: the average geocode of the above cell-id information.
- nbsamples: the number of samples submitted for the given cell-id. The higher the number of samples, the more accurate the geocode is likely to be.
After converting the raw cell-id data downloaded from the opencellid website into a SQL Server 2008 database and running some queries to remove apparently invalid entries (invalid MCC, MNC, LAC or CID), I observed the following statistics at the time of writing:
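The fields above can be loaded into an offline lookup table with a few lines of Python. This is only a sketch: it assumes the file has a header row with exactly these column names, which may not match the real download, and the sample rows are invented.

```python
import csv
import io

# Invented sample rows; the header and column order of the real file
# are assumptions here.
SAMPLE = """mcc,mnc,lac,cellid,lon,lat,nbsamples
525,1,100,12345,103.85,1.29,7
262,2,200,54321,13.40,52.52,3
"""

def load_cells(fileobj):
    """Build a lookup from (mcc, mnc, lac, cellid) to (lat, lon, nbsamples)."""
    cells = {}
    for row in csv.DictReader(fileobj):
        key = (int(row["mcc"]), int(row["mnc"]), int(row["lac"]), int(row["cellid"]))
        cells[key] = (float(row["lat"]), float(row["lon"]), int(row["nbsamples"]))
    return cells

cells = load_cells(io.StringIO(SAMPLE))

# Look up the cell the phone is currently connected to.
print(cells.get((525, 1, 100, 12345)))
# An unknown cell simply returns None.
print(cells.get((525, 1, 100, 99999)))
```

For the real file, pass `open("cells.txt", newline="")` to `load_cells` instead of the in-memory sample.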
+ Total apparently valid entries: 570296
+ Total entries for Singapore: 3232. [In Singapore, Singtel alone is estimated to have at least 10 000 cells]
+ Total number of MCCs: 182
+ Top 5 MCCs having the most entries: 262 (Germany, 79725 entries), 234 (UK, 47954 entries), 724 (Brazil, 39343 entries), 310 (USA, 33510 entries), 250 (Russia, 28533 entries)
+ Number of MCCs having fewer than 100 entries: 64
+ Number of MCCs having fewer than 1000 entries: 122
+ Number of MCCs having fewer than 5000 entries: 153
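For readers without SQL Server, the same kind of cleanup and counting can be sketched in Python. The validity ranges in `is_plausible` are illustrative assumptions, not the queries I actually ran, and the entries are invented.

```python
from collections import Counter

def is_plausible(mcc, mnc, lac, cid):
    """Rough sanity check; these ranges are illustrative assumptions."""
    return (
        200 <= mcc <= 799            # assumed range for geographic MCCs
        and 0 <= mnc <= 999          # MNC has at most three digits
        and 0 < lac <= 0xFFFF        # LAC is a 16-bit value, 0 treated as invalid
        and 0 < cid <= 0x0FFFFFFF    # allow up to 28-bit cell IDs
    )

def mcc_stats(entries, threshold=100):
    """Return (valid entry count, top 5 MCCs, number of MCCs below threshold)."""
    valid = [e for e in entries if is_plausible(*e)]
    per_mcc = Counter(e[0] for e in valid)
    sparse = sum(1 for n in per_mcc.values() if n < threshold)
    return len(valid), per_mcc.most_common(5), sparse

# Invented (mcc, mnc, lac, cellid) tuples; the last two should be rejected.
entries = [
    (262, 2, 200, 54321),
    (262, 2, 200, 54322),
    (525, 1, 100, 12345),
    (0, 0, 0, 0),
    (234, 10, 0, 99),
]
total, top, sparse = mcc_stats(entries, threshold=2)
print(total, top, sparse)
```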
The database is clearly incomplete, and its accuracy also has to be questioned. However, given the open nature of the project, I hope that it will improve over time, both in the number of entries and in accuracy, and eventually become as reliable as Google’s location API.
When I open the raw data from Opencellid, cells.txt, with Notepad, all I can see is unreadable characters. How do I convert the raw data into a readable form?
I had the same problem. Turns out it was compressed twice. I first gunzipped cells.txt.gz, resulting in cells.txt which was all binary. Then I renamed cells.txt to cells.txt.gz, and gunzipped that (again). This resulted in a readable cells.txt.
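The same double gunzip can be done without renaming files. Below is a small Python sketch that keeps decompressing as long as the data still starts with the gzip magic bytes; the helper name `gunzip_fully` and the `max_layers` limit are my own choices, and the demo uses in-memory data rather than the real download.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream

def gunzip_fully(data, max_layers=5):
    """Decompress repeatedly while the payload still looks gzipped.

    Returns (decompressed bytes, number of layers removed).
    """
    layers = 0
    while data[:2] == GZIP_MAGIC and layers < max_layers:
        data = gzip.decompress(data)
        layers += 1
    return data, layers

# Simulate a file that was compressed twice.
original = b"mcc,mnc,lac,cellid,lon,lat,nbsamples\n"
double = gzip.compress(gzip.compress(original))

plain, layers = gunzip_fully(double)
print(layers, plain)
```

For the real file, read it with `open("cells.txt.gz", "rb").read()`, pass the bytes to `gunzip_fully`, and write the result back out as cells.txt.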