It’s as simple as:
$ osm2pgsql -l -c -S default.style africa.osm.bz2 -d osm
The -l switch asks to keep the lat/long projection, -c requests creation of the schema, and -d specifies the database to use. The default.style file, passed with -S, is a configuration specifying what to import and how; I used the default for the sake of this test.
 Schema |        Name        |   Type
--------+--------------------+----------
 public | planet_osm_line    | table
 public | planet_osm_point   | table
 public | planet_osm_polygon | table
 public | planet_osm_roads   | table
And the way we’ve been using for testing has all its characters:
Let’s see it in Quantum GIS, compared with the one coming from the corrupted shapefile (which I’ve imported into postgis after hacking shp2pgsql to discard incomplete multibytes):
The difference you may notice seems to be due to left-to-right vs. right-to-left orientation of the text. My terminal seems to ignore orientation, qgis doesn’t.
Now, time to see if a shapefile will be able to bear all that UNICODE. Let’s not do anything fancy, just dump the roads table using pgsql2shp:
$ pgsql2shp osm planet_osm_roads
Pretty fast (slightly above 1 second system time, 8 secs real time). And here’s the generated shapefile dataset:
72979148 planet_osm_roads.shp
69555432 planet_osm_roads.dbf
  516268 planet_osm_roads.shx
     257 planet_osm_roads.prj
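Before moving on, it can be handy to confirm that all four files of the dataset are really there. The following is only a sketch: `missing_shapefile_parts` is a hypothetical helper of mine, not something shipped with PostGIS, and it simply assumes the dataset consists of the four extensions listed above.

```shell
#!/bin/sh
# Hypothetical helper (not part of PostGIS): print every file of a shapefile
# dataset that is missing or empty, given the base name without extension.
missing_shapefile_parts() {
    base=$1
    for ext in shp dbf shx prj; do
        # -s is true only if the file exists and has a size greater than zero
        [ -s "$base.$ext" ] || printf '%s\n' "$base.$ext"
    done
}

# Usage: the function prints nothing when all four files are present.
missing=$(missing_shapefile_parts planet_osm_roads)
if [ -z "$missing" ]; then
    echo "dataset complete"
else
    echo "missing or empty: $missing"
fi
```

Printing the missing names, rather than just returning a status, makes it obvious which part of the export went wrong.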
Do they have the full multibyte strings now? Sure: shp2pgsql doesn’t complain anymore, and you can safely import into postgis again, completing the round-trip. You only have to specify UTF-8 as the input encoding because the new default encoding, as I pointed out in the previous post, is that unmentionable one… So:
$ shp2pgsql -W UTF-8 planet_osm_roads planet_osm_roads_roundtrip | psql osm
...
$ psql osm -c 'select name from planet_osm_roads_roundtrip where osm_id = 4005333';
Avenue des Nations Unies - شارع الأمم المتحدة
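If you want something less eyeball-based than reading the name off the terminal, a stream of bytes can be checked for incomplete multibyte sequences with iconv. This is a minimal sketch, not what shp2pgsql does internally: `is_valid_utf8` is a hypothetical helper, and it assumes an iconv that exits with a non-zero status on invalid or incomplete input (glibc's does; I assume other implementations behave the same).

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if stdin is well-formed UTF-8.
# Assumption: iconv exits non-zero on an invalid or incomplete sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# The Arabic letter sheen (U+0634) is the two bytes 0xD8 0xB4,
# written here as the octal escapes \330 \264.
if printf 'Avenue \330\264' | is_valid_utf8; then
    echo "complete sequence: valid"
fi

# Dropping the second byte simulates a string cut in the middle of a character.
if ! printf 'Avenue \330' | is_valid_utf8; then
    echo "truncated sequence: invalid"
fi
```

The second case is the kind of damage the corrupted shapefile had: a lead byte left without its continuation byte.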
Also, we can open the shapefile itself with qgis and see how it looks:
Green is the pgsql2shp-exported shapefile, red is osm2pgsql-imported planet osm, black is geofabrik-imported shapefile.
All clean and easy 🙂
Further exercises would include tweaking the osm2pgsql style file and the import process in general to better select data of interest, properly cleaning geometry invalidities, and taking care of incremental updates of the data.
Good luck and happy hacking !