{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Number of projects per Kickstarter category\n", "\n", "For each category and subcategory, count how many projects exist in total, how many are or were successful, and how many are live." ] }, { "cell_type": "code", "execution_count": 182, "metadata": {}, "outputs": [], "source": [ "import json\n", "import time\n", "import datetime\n", "import selenium\n", "from selenium import webdriver\n", "from multiprocessing import Pool\n", "from jupyter_progressbar import ProgressBar\n", "from ipy_table import make_table, set_row_style\n", "from IPython.display import display, Image, HTML" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Executed around:" ] }, { "cell_type": "code", "execution_count": 191, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2018-01-03 17:56\n" ] } ], "source": [ "d = datetime.datetime.now()\n", "print(d.strftime('%Y-%m-%d %H:%M'))" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "driver = webdriver.Chrome()\n", "\n", "root = 'https://www.kickstarter.com/'\n", "driver.get(root)\n", "driver.execute_script('$(\".section_global-nav-left > button:first-child\").click()')\n", "time.sleep(3)\n", "category_links = driver.execute_script(\"return $('a').map(function(i, x) { return $(x).attr('href'); }).filter(function(i, x) { return x.indexOf('/discover/categories') >= 0; })\")\n", "category_links = list(set(category_links))\n", "\n", "driver.close()\n", "driver.quit()" ] }, { "cell_type": "code", "execution_count": 145, "metadata": { "scrolled": true }, "outputs": [], "source": [ "def get_count(driver, url):\n", "    # Return the number from the first element reading 'N projects', or -1 if none is found.\n", "    driver.get(url)\n", "    try:\n", "        return next(\n", "            int(element.text.replace(' projects', '').replace(',', ''))\n", "            for element in driver.find_elements_by_class_name('count')\n", "            if element.text.endswith(' projects')\n", "        )\n", "    except StopIteration:\n", "        return -1\n", "\n", "def 
get_rows(urls):\n", "    # Scrape the total, successful and live counts for each category URL.\n", "    driver = webdriver.Chrome()\n", "    result = []\n", "    try:\n", "        for url in urls:\n", "            category = url.split('?')[0][len('https://www.kickstarter.com/discover/categories/'):].replace('%20', ' ').replace('%2520', ' ')\n", "\n", "            category, subcategory = (category.split('/') + ['', ''])[:2]\n", "\n", "            all_projects = get_count(driver, url)\n", "            live_projects = get_count(driver, url + '&state=live')\n", "            success_projects = get_count(driver, url + '&state=successful')\n", "\n", "            result.append([category, subcategory, all_projects, success_projects, live_projects])\n", "    finally:\n", "        driver.quit()\n", "    return result\n", "\n", "results = []\n", "pool = Pool(8)\n", "# Split the links into chunks of 11, one browser per chunk, including the final partial chunk.\n", "for start in range(0, len(category_links), 11):\n", "    results.append(pool.apply_async(get_rows, [category_links[start:start + 11]]))" ] }, { "cell_type": "code", "execution_count": 192, "metadata": {}, "outputs": [], "source": [ "table = [['category', 'subcategory', 'total', 'successful', 'live']]\n", "\n", "for part in results:\n", "    assert part.ready()\n", "    table.extend(part.get())\n", "table = table[:1] + sorted(table[1:])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Result\n", "\n", "Green marks a top-level category (not a subcategory). Red marks more than 2400 projects, the limit that can be scraped successfully. A value of -1 means no project count was found on the page." ] }, { "cell_type": "code", "execution_count": 193, "metadata": { "scrolled": false }, "outputs": [ { "data": { "text/html": [ "
category | subcategory | total | successful | live |
art | | 28151 | 11497 | 207 |
art | ceramics | 308 | 134 | 5 |
art | conceptual art | 1027 | 366 | 8 |
art | digital art | 1348 | 374 | 13 |
art | illustration | 3192 | 1636 | 44 |
art | installations | 484 | 235 | 7 |
art | mixed media | 2757 | 948 | 19 |
art | painting | 3294 | 1145 | 19 |
art | performance art | 2151 | 930 | 6 |
art | public art | 3085 | 1549 | 9 |
art | sculpture | 1809 | 697 | 4 |
art | textiles | 268 | 72 | 2 |
art | video art | 193 | 52 | 2 |
comics | | 10846 | 5855 | 81 |
comics | anthologies | 404 | 303 | 3 |
comics | comic books | 2792 | 1645 | 34 |
comics | events | 160 | 47 | 2 |
comics | graphic novels | 1862 | 1014 | 19 |
comics | webcomics | 657 | 359 | 5 |
crafts | | 8785 | 2090 | 81 |
crafts | candles | 427 | 55 | 3 |
crafts | crochet | 163 | 35 | -1 |
crafts | diy | 1176 | 184 | 21 |
crafts | embroidery | 116 | 21 | 1 |
crafts | glass | 140 | 38 | 1 |
crafts | knitting | 184 | 85 | 1 |
crafts | pottery | 98 | 44 | 3 |
crafts | quilts | 81 | 19 | -1 |
crafts | stationery | 230 | 82 | 2 |
crafts | taxidermy | 13 | 5 | -1 |
crafts | weaving | 92 | 25 | 1 |
crafts | woodworking | 1166 | 293 | 12 |
dance | | 3774 | 2341 | 18 |
dance | performances | 1018 | 628 | 7 |
dance | residencies | 69 | 49 | -1 |
dance | spaces | 201 | 71 | 2 |
dance | workshops | 164 | 51 | 4 |
design | | 30116 | 10538 | 309 |
design | architecture | 758 | 178 | 9 |
design | civic design | 290 | 104 | 5 |
design | graphic design | 2007 | 752 | 10 |
design | interactive design | 393 | 78 | 4 |
design | product design | 22437 | 7998 | 243 |
design | typography | 106 | 63 | 2 |
fashion | | 22847 | 5626 | 255 |
fashion | accessories | 3195 | 1095 | 59 |
fashion | apparel | 7199 | 1441 | 110 |
fashion | childrenswear | 492 | 95 | 3 |
fashion | couture | 275 | 42 | 5 |
fashion | footwear | 929 | 299 | 8 |
fashion | jewelry | 1224 | 301 | 22 |
fashion | pet fashion | 141 | 39 | -1 |
fashion | ready-to-wear | 874 | 148 | 11 |
film & video | | -1 | -1 | -1 |
film & video | | 64758 | 24041 | 341 |
film & video | action | 739 | 107 | 5 |
film & video | animation | 2551 | 685 | 16 |
film & video | comedy | 2135 | 825 | 27 |
film & video | documentary | 16144 | 5925 | 61 |
film & video | drama | 2168 | 806 | 36 |
film & video | experimental | 556 | 146 | 4 |
film & video | family | 335 | 67 | 4 |
film & video | fantasy | 341 | 105 | 5 |
film & video | festivals | 291 | 133 | 2 |
film & video | horror | 1298 | 402 | 16 |
film & video | music videos | 703 | 239 | 7 |
film & video | narrative film | 5191 | 2016 | 14 |
film & video | romance | 186 | 53 | 1 |
film & video | science fiction | 746 | 273 | 8 |
film & video | shorts | 12372 | 6685 | 48 |
film & video | television | 1012 | 155 | 9 |
film & video | thrillers | 753 | 204 | 10 |
film & video | webseries | 5758 | 1697 | 15 |
food | | 24634 | 6107 | 194 |
food | bacon | 219 | 38 | 1 |
food | community gardens | 296 | 67 | 2 |
food | cookbooks | 544 | 136 | 9 |
food | drinks | 2432 | 597 | 33 |
food | events | 658 | 108 | 3 |
food | farmer's markets | 429 | 72 | 5 |
food | farms | 1154 | 246 | 12 |
food | food trucks | 1757 | 220 | 20 |
food | restaurants | 2828 | 458 | 30 |
food | small batch | 1816 | 558 | 22 |
food | spaces | 427 | 122 | 4 |
food | vegan | 593 | 187 | 7 |
games | | 35300 | 12571 | 307 |
games | gaming hardware | 434 | 103 | 4 |
games | live games | 1051 | 181 | 6 |
games | mobile games | 2032 | 202 | 22 |
games | playing cards | 2487 | 963 | 41 |
games | puzzles | 227 | 85 | 4 |
games | video games | 11640 | 2354 | 89 |
journalism | | 4755 | 1020 | 33 |
journalism | audio | 408 | 111 | 4 |
journalism | photo | 195 | 33 | -1 |
journalism | 729 | 165 | 4 | |
journalism | video | 426 | 51 | 2 |
journalism | web | 1248 | 185 | 14 |
music | | 54224 | 26767 | 283 |
music | blues | 267 | 118 | 2 |
music | classical music | 2620 | 1653 | 11 |
music | comedy | 19 | 6 | 2 |
music | country & folk | 4461 | 2818 | 17 |
music | electronic music | 2175 | 701 | 14 |
music | faith | 1094 | 455 | 11 |
music | hip-hop | 3915 | 604 | 33 |
music | indie rock | 5659 | 3621 | 14 |
music | jazz | 1862 | 1111 | 16 |
music | kids | 282 | 124 | 3 |
music | latin | 140 | 39 | 5 |
music | metal | 719 | 275 | 6 |
music | pop | 3358 | 1563 | 20 |
music | punk | 318 | 146 | 4 |
music | r&b | 461 | 108 | 3 |
music | rock | 6766 | 3504 | 31 |
music | world music | 2108 | 927 | 12 |
photography | | 10782 | 3300 | 52 |
photography | animals | 257 | 63 | 4 |
photography | fine art | 771 | 282 | 8 |
photography | people | 1098 | 229 | 7 |
photography | photobooks | 1597 | 643 | 15 |
photography | places | 745 | 120 | 3 |
publishing | | 40145 | 12325 | 300 |
publishing | academic | 916 | 186 | 11 |
publishing | anthologies | 383 | 219 | 5 |
publishing | art books | 2693 | 1366 | 20 |
publishing | calendars | 333 | 131 | 8 |
publishing | children's books | 6771 | 2349 | 43 |
publishing | comedy | 73 | 23 | 3 |
publishing | fiction | 9176 | 2243 | 48 |
publishing | letterpress | 48 | 30 | 1 |
publishing | literary journals | 276 | 130 | 5 |
publishing | literary spaces | 45 | 31 | 3 |
publishing | nonfiction | 8297 | 2224 | 48 |
publishing | periodicals | 1263 | 514 | 6 |
publishing | poetry | 1375 | 488 | 9 |
publishing | radio & podcasts | 924 | 394 | 6 |
publishing | translations | 158 | 35 | 5 |
publishing | young adult | 823 | 172 | 9 |
publishing | zines | 391 | 179 | 8 |
technology | | 32610 | 6474 | 380 |
technology | 3d printing | 691 | 247 | 8 |
technology | apps | 6356 | 381 | 78 |
technology | camera equipment | 426 | 198 | 5 |
technology | diy electronics | 912 | 419 | 10 |
technology | fabrication tools | 248 | 67 | 4 |
technology | flight | 422 | 73 | 4 |
technology | gadgets | 3064 | 879 | 49 |
technology | hardware | 3670 | 1217 | 27 |
technology | makerspaces | 237 | 75 | 1 |
technology | robots | 574 | 227 | 4 |
technology | software | 3036 | 373 | 30 |
technology | sound | 699 | 289 | 16 |
technology | space exploration | 321 | 119 | 3 |
technology | wearables | 1232 | 381 | 24 |
technology | web | 3887 | 257 | 42 |
theater | | 10820 | 6478 | 41 |
theater | comedy | 100 | 61 | 4 |
theater | experimental | 373 | 209 | 3 |
theater | festivals | 547 | 322 | 2 |
theater | immersive | 335 | 173 | 1 |
theater | musical | 916 | 465 | 7 |
theater | plays | 1382 | 807 | 15 |
theater | spaces | 208 | 95 | 1 |
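
Several rows above contain -1, the value `get_count` returns when it finds no "N projects" label. The sketch below shows one way such rows might be handled when reading the table. `parse_count` and `success_share` are hypothetical helpers, not part of the notebook, and `parse_count` only mirrors the string handling in `get_count` without a browser. The three sample rows are copied from the table.

```python
from typing import Optional

MISSING = -1  # sentinel returned by get_count when no count label is found


def parse_count(label: str) -> int:
    """Turn a label such as '1,234 projects' into 1234, or MISSING otherwise."""
    suffix = ' projects'
    if not label.endswith(suffix):
        return MISSING
    return int(label[:-len(suffix)].replace(',', ''))


def success_share(total: int, successful: int) -> Optional[float]:
    """Successful projects as a share of all projects, or None if unknown."""
    if total in (MISSING, 0) or successful == MISSING:
        return None
    return successful / total


# Rows in the same layout as the table: category, subcategory, total, successful, live.
rows = [
    ['art', 'ceramics', 308, 134, 5],
    ['crafts', 'crochet', 163, 35, -1],
    ['film & video', '', -1, -1, -1],
]

for category, subcategory, total, successful, live in rows:
    share = success_share(total, successful)
    shown = 'n/a' if share is None else '{:.1%}'.format(share)
    print(category, subcategory or '(all)', shown)
```

The share is taken over all projects, including live ones that have not finished yet, so it understates the success rate of completed projects.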