2018-09-28 10:35:17 +02:00
"cells": [
"cell_type": "markdown",
"metadata": {},
"source": [
"# Extract dialect regions from image\n",
"Using image processing, extract region polygons for the dialects depicted in this image\n",
"![dialect regions](../data/dialects.png)"
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"from math import floor\n",
"import json\n",
"import folium\n",
"from folium_jsbutton import JsButton\n",
"from imageio import imread\n",
"from collections import Counter\n",
"from math import sqrt\n",
"import numpy as np\n",
"%matplotlib notebook\n",
"from matplotlib import pyplot as plt\n",
"from skimage.morphology import binary_closing\n",
"from skimage.measure import find_contours, label"
"cell_type": "markdown",
"metadata": {},
"source": [
"# Input\n",
"Load the image and count how often each color occurs."
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"im = imread('../data/dialects.png')\n",
"color_occurence = Counter(map(tuple, im.reshape(-1,3)))\n",
"color_sorted_by_occurence = [c for c, _ in sorted(\n",
"    color_occurence.items(),\n",
"    key=lambda x: x[1],\n",
"    reverse=True\n",
")]"
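As a minimal sketch of the counting step, the snippet below applies the same `Counter` idea to a tiny hand-made image. The pixel values and the names `toy_image`, `color_counts` and `colors_by_frequency` are made up for illustration and do not come from `dialects.png`.

```python
from collections import Counter

# Hypothetical 2x3 RGB image as nested lists (rows of (r, g, b) pixels).
toy_image = [
    [(255, 255, 255), (255, 0, 0), (255, 0, 0)],
    [(255, 0, 0), (0, 0, 255), (255, 255, 255)],
]

# Flatten the rows into one pixel sequence and count each distinct color.
color_counts = Counter(pixel for row in toy_image for pixel in row)

# most_common() returns (color, count) pairs, most frequent first.
colors_by_frequency = [color for color, _ in color_counts.most_common()]
```

`Counter.most_common()` orders colors by count, which is what the explicit `sorted(..., reverse=True)` call in the notebook cell does as well.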
"cell_type": "markdown",
"metadata": {},
"source": [
"# Relevant colors\n",
"Show the most frequent colors and select those that mark the dialect regions."
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
"metadata": {},
"output_type": "display_data"
"data": {
"text/html": [
"text/plain": [
"metadata": {},
"output_type": "display_data"
"source": [
"pallete_width = floor(sqrt(len(color_sorted_by_occurence)))\n",
"pallette = np.array(color_sorted_by_occurence)[:pallete_width*pallete_width]\n",
"pallette = pallette.reshape(pallete_width, pallete_width, 3)\n",
"_, (ax0, ax1) = plt.subplots(1,2)\n",
"for x in range(pallete_width):\n",
"    for y in range(pallete_width):\n",
"        ax0.text(x-0.5, y+0.5, x + y*pallete_width)\n",
"ax0.set_title('(almost) all colors')\n",
"legend_color_indices = [3, 4, 7, 8]\n",
"legend_colors = [color_sorted_by_occurence[i] for i in legend_color_indices]\n",
"pallette = np.array(legend_colors).reshape(1, len(legend_color_indices), 3)\n",
"ax1.set_title('selected colors')\n",
"regions = ['Klaaifrysk', 'Waldfrysk', 'Sudwesthoeksk', 'Noardhoeksk']"
"cell_type": "markdown",
"metadata": {},
"source": [
"# Georeferencing\n",
"Use folium to find the north-east and south-west corners of the image."
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
"data": {
"text/html": [
"text/plain": [
"<folium.folium.Map at 0x7fe931d2d828>"
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
"source": [
"bounds = [\n",
" [53.54434089638824, 6.520699920654293],\n",
" [52.59243228879456, 4.684483127594008]\n",
"center = (bounds[0][0] + bounds[1][0]) / 2, (bounds[0][1] + bounds[1][1]) / 2\n",
"m = folium.Map(center, zoom_start=9, tiles='stamentoner')\n",
"img = folium.raster_layers.ImageOverlay(\n",
"    name='Dialect regions',\n",
"    image='../data/dialects.png',\n",
"    bounds=bounds,\n",
"    opacity=0.6,\n",
"    interactive=True,\n",
"    cross_origin=False,\n",
"    zindex=1,\n",
")\n",
"control = 1\n",
"for corner, symbol0 in zip(['_northEast', '_southWest'], ['⌜', '⌟']):\n",
"    for axis, symbol1 in zip(['lng', 'lat'], ['⟷', '↕']):\n",
"        for direction in ['-', '+']:\n",
"            JsButton(\n",
"# title='{} {} {}'.format(symbol0, symbol1, direction),\n",
"                title = str(control),\n",
"                function=\"\"\"\n",
"                function(btn, map) {{\n",
"                    var overlay = image_overlay_{overlay_id};\n",
"                    var bounds = overlay.getBounds();\n",
"                    bounds.{corner}.{axis} {direction}= 0.001;\n",
"                    overlay.setBounds(bounds);\n",
"                }}\n",
"                \"\"\".format(\n",
"                    overlay_id = img._id, axis=axis, direction=direction,corner=corner\n",
"                )).add_to(m)\n",
"            control += 1\n",
" \n",
" \n",
"JsButton(\n",
"    title = '[]',\n",
"    function=\"\"\"\n",
"    function(btn, map) {{\n",
"        var overlay = image_overlay_{overlay_id};\n",
"        var bounds = overlay.getBounds();\n",
"        console.log([[bounds._northEast.lat, bounds._northEast.lng], [bounds._southWest.lat, bounds._southWest.lng]]);\n",
"    }}\n",
"    \"\"\".format(\n",
"        overlay_id = img._id\n",
"    )).add_to(m)\n",
"cell_type": "markdown",
"metadata": {},
"source": [
"# Locate contours\n",
"Find the contours for the connected components marked by the colors."
"cell_type": "code",
"execution_count": 5,
"metadata": {
"scrolled": false
"outputs": [
"data": {
"text/plain": [
"metadata": {},
"output_type": "display_data"
"data": {
"text/html": [
"text/plain": [
"metadata": {},
"output_type": "display_data"
"source": [
"axes = plt.subplots(2,2)[1].ravel()\n",
"contours = []\n",
"for axis, c in zip(axes, np.array(legend_colors)):\n",
"    bi = (im[:-100] == c[None,None]).min(axis=2)\n",
"    bi = binary_closing(bi, np.ones((5,5)))\n",
"    \n",
"    labels = label(bi, background=False)\n",
"    contours.append(find_contours(bi, 0.5))\n",
"    axis.imshow(bi)\n",
"    for n, contour in enumerate(contours[-1][:1]):\n",
"        axis.plot(contour[:, 1], contour[:, 0], linewidth=2)\n",
"    axis.set_xticks([]); axis.set_yticks([])\n",
"cell_type": "markdown",
"metadata": {},
"source": [
"Translate pixel coordinates to latitude and longitude."
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"(y0, x1), (y1, x0) = bounds\n",
"scale_x = lambda x: x0 + (x / im.shape[1]) * (x1 - x0)\n",
"scale_y = lambda y: y0 + (y / im.shape[0]) * (y1 - y0)\n",
"contours_scaled = [\n",
"    list(zip(scale_x(c[0][:, 1]), scale_y(c[0][:, 0])))\n",
"    for c in contours\n",
"]"
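The scaling above is a linear interpolation between the image corners. A minimal sketch of the same idea as a single function follows. The name `pixel_to_lonlat`, the image size and the corner coordinates are hypothetical, chosen so the result is easy to check by hand.

```python
def pixel_to_lonlat(row, col, height, width, north_east, south_west):
    """Linearly interpolate a pixel position to (longitude, latitude).

    north_east and south_west are (lat, lng) pairs, in the same order as
    the two entries of `bounds`. Row 0 is the northern edge of the image
    and column 0 is the western edge.
    """
    north, east = north_east
    south, west = south_west
    lon = west + (col / width) * (east - west)
    lat = north + (row / height) * (south - north)
    return lon, lat


# Hypothetical 100x200 pixel image spanning one degree in each direction.
top_left = pixel_to_lonlat(0, 0, 100, 200, (53.0, 6.0), (52.0, 5.0))
centre = pixel_to_lonlat(50, 100, 100, 200, (53.0, 6.0), (52.0, 5.0))
```

Here `top_left` is the north-west corner and `centre` lies halfway between the corners. Treating latitude as linear in the pixel row is an approximation on a Web Mercator map, so positions may be slightly off for images that span a large latitude range.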
"cell_type": "markdown",
"metadata": {},
"source": [
"# Result"
"cell_type": "code",
"execution_count": 7,
"metadata": {
"scrolled": true
"outputs": [],
"source": [
"geojson = json.dumps({\n",
"    \"type\": \"FeatureCollection\",\n",
"    \"features\": [\n",
"        {\n",
"            \"type\": \"Feature\",\n",
"            \"properties\": {'dialect': dialect},\n",
"            \"geometry\": {\n",
"                \"type\": \"Polygon\",\n",
"                \"coordinates\": [list(map(list, contour))]\n",
"            }\n",
"        }\n",
"        for contour, dialect in zip(contours_scaled, regions)\n",
"    ]\n",
"})\n",
"with open('../data/frysian_dialect_regions.geojson', 'w') as f:\n",
"    f.write(geojson)"
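GeoJSON expects positions in longitude, latitude order and polygon rings that end on their starting position. The sketch below builds one feature with an explicit ring-closing check. The helper `polygon_feature` and the triangle coordinates are hypothetical and only illustrate the structure written above.

```python
import json


def polygon_feature(ring, dialect):
    """Build a GeoJSON Polygon feature from (lon, lat) pairs."""
    coords = [list(point) for point in ring]
    # A GeoJSON linear ring must end on its starting position.
    if coords and coords[0] != coords[-1]:
        coords.append(list(coords[0]))
    return {
        "type": "Feature",
        "properties": {"dialect": dialect},
        "geometry": {"type": "Polygon", "coordinates": [coords]},
    }


# Hypothetical triangle; the coordinates are illustrative only.
feature = polygon_feature([(5.0, 53.0), (5.5, 53.0), (5.2, 52.8)], "Klaaifrysk")
collection = {"type": "FeatureCollection", "features": [feature]}
geojson_text = json.dumps(collection)
```

If a contour touches the image border it may come back open, in which case a check like this one would close the ring before the polygon is written out.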
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
"data": {
"text/html": [
"text/plain": [
"<folium.folium.Map at 0x7fe931e575c0>"
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
"source": [
"m = folium.Map(\n",
"    location=center,\n",
"    tiles='Mapbox Bright',\n",
"    zoom_start=9\n",
")\n",
"folium.GeoJson('../data/frysian_dialect_regions.geojson', name='geojson').add_to(m)\n",
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.5"
"nbformat": 4,
"nbformat_minor": 1