How to create a model from my data on Kaggle

  • Step 1: Prepare our data

1. Define a function that searches for images

    import os
    iskaggle = os.environ.get('KAGGLE_KERNEL_RUN_TYPE', '')

    if iskaggle:
        !pip install -Uqq fastai 'duckduckgo_search>=6.2'

    from duckduckgo_search import DDGS
    from fastcore.all import *
    import time, json

    def search_images(keywords, max_images=200):
        # Return the image URLs found for the given keywords
        return L(DDGS().images(keywords, max_results=max_images)).itemgot('image')
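
Search services sometimes reject requests that arrive too quickly, so it can help to retry a failed search after a pause. The following is a minimal sketch of that idea, not part of the original notebook: `with_retries` and `flaky_search` are hypothetical names, and `flaky_search` is only a stand-in for `search_images` so the example runs without network access.

```python
import time

def with_retries(fn, *args, attempts=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying on any exception with growing delays."""
    for attempt in range(attempts):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # wait 1x, 2x, 4x ... the base delay

# Hypothetical stand-in for search_images: fails twice, then returns one made-up URL.
calls = {"n": 0}

def flaky_search(keywords, max_images=200):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("rate limited")
    return [f"https://example.com/{keywords.replace(' ', '_')}.jpg"][:max_images]

# Record the delays instead of really sleeping, so the demo finishes instantly.
delays = []
urls_demo = with_retries(flaky_search, "dog photos", max_images=1, sleep=delays.append)
```

In a notebook, the same wrapper could presumably be used as `with_retries(search_images, 'dog photos', max_images=1)`, with the default `time.sleep` doing the waiting.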

2. Search for a dog photo and get its URL from the search results

    urls = search_images('dog photos', max_images=1)

3. Download the image and take a look at it

    from fastdownload import download_url
    dest = 'dog.jpg'
    download_url(urls[0], dest, show_progress=False)

    from fastai.vision.all import *
    im = Image.open(dest)
    im.to_thumb(256,256)

4. Do the same for a cat photo

    download_url(search_images('cat photos', max_images=1)[0], 'cat.jpg', show_progress=False)
    Image.open('cat.jpg').to_thumb(256,256)

5. Grab example photos of both dogs and cats, and save each group to its own folder

    searches = 'dog', 'cat'
    path = Path('dog_or_not')

    for o in searches:
        # Make one sub-directory per category inside dog_or_not
        dest = (path/o)
        dest.mkdir(exist_ok=True, parents=True)

        download_images(dest, urls=search_images(f'{o} photo'))
        time.sleep(5)  # pause between searches
        resize_images(path/o, max_size=400, dest=path/o)
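
To give a feel for what `max_size=400` means, here is a small sketch of the arithmetic involved. It assumes the resize keeps the aspect ratio and caps the longer side at `max_size`; `capped_size` is a hypothetical helper written for illustration, not fastai's own code.

```python
def capped_size(width, height, max_size=400):
    """Return (width, height) scaled so the longer side is at most max_size."""
    longest = max(width, height)
    if longest <= max_size:
        return width, height  # already small enough: leave untouched
    # Integer arithmetic keeps the aspect ratio (rounded down to whole pixels)
    return width * max_size // longest, height * max_size // longest

landscape = capped_size(1600, 1200)  # a large landscape photo
portrait = capped_size(1000, 2000)   # a large portrait photo
small = capped_size(300, 200)        # already under the cap
```

Smaller files load faster during training, which is presumably why the photos are shrunk right after they are downloaded.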

6. Remove any photos that did not download correctly, since they could cause model training to fail

    failed = verify_images(get_image_files(path))
    failed.map(Path.unlink)
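
A download can "succeed" yet save something that is not an image at all, such as an HTML error page. The sketch below illustrates the idea of screening files using only the standard library by checking the first bytes of each file. It is a much cruder test than `verify_images`, which tries to open each image, and the names `looks_like_image` and `find_suspect_files` are hypothetical.

```python
from pathlib import Path
import tempfile

# Leading bytes ("magic numbers") of a few common image formats
SIGNATURES = (
    b"\xff\xd8\xff",        # JPEG
    b"\x89PNG\r\n\x1a\n",   # PNG
    b"GIF8",                # GIF
)

def looks_like_image(path):
    """True if the file starts with a known image signature."""
    head = Path(path).read_bytes()[:8]
    return any(head.startswith(sig) for sig in SIGNATURES)

def find_suspect_files(folder):
    """Return files under folder that do not start like an image."""
    return sorted(p for p in Path(folder).rglob("*")
                  if p.is_file() and not looks_like_image(p))

# Demo on a throwaway folder: one plausible JPEG header, one HTML error page
with tempfile.TemporaryDirectory() as tmp:
    dog_dir = Path(tmp) / "dog"
    dog_dir.mkdir()
    (dog_dir / "good.jpg").write_bytes(b"\xff\xd8\xff\xe0" + b"\x00" * 16)
    (dog_dir / "bad.jpg").write_bytes(b"<html>Not found</html>")
    suspect_names = [p.name for p in find_suspect_files(tmp)]
```

A file can pass this check and still be truncated or corrupt, so the fastai call above remains the step to rely on.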

  • Step 2: Train our model

1. Create the dataloaders using a DataBlock

    dls = DataBlock(
        blocks=(ImageBlock, CategoryBlock),
        get_items=get_image_files,
        splitter=RandomSplitter(valid_pct=0.2, seed=42),
        get_y=parent_label,
        item_tfms=[Resize(192, method='squish')]
    ).dataloaders(path, bs=32)

    dls.show_batch(max_n=6)
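
Two of the DataBlock arguments are easy to picture in plain Python: `get_y=parent_label` takes each image's label from its folder name, and `RandomSplitter(valid_pct=0.2, seed=42)` holds back a reproducible random 20% of the files for validation. The sketch below imitates both ideas with the standard library only. `label_from_parent` and `random_split` are hypothetical helpers, and the exact files chosen will differ from what fastai picks.

```python
import random
from pathlib import Path

def label_from_parent(path):
    """Use the name of the containing folder as the label."""
    return Path(path).parent.name

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed, then hold back valid_pct for validation."""
    rng = random.Random(seed)
    idxs = list(range(len(items)))
    rng.shuffle(idxs)
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in idxs[:cut]]
    train = [items[i] for i in idxs[cut:]]
    return train, valid

# Made-up file names that mirror the folder layout created in Step 1
files = [Path("dog_or_not") / kind / f"{i}.jpg"
         for kind in ("dog", "cat") for i in range(5)]
train, valid = random_split(files)
first_label = label_from_parent(files[0])
```

Fixing the seed means the same files land in the validation set on every run, which makes results comparable between training runs.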

2. Use a pretrained model (resnet50) and fine-tune it on our dataset

    learn = vision_learner(dls, resnet50, metrics=error_rate)
    learn.fine_tune(3)
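
The `error_rate` metric reported during training is the fraction of validation images the model labels incorrectly, in other words 1 minus accuracy. A minimal sketch of that calculation on made-up labels follows; `error_rate_sketch` is a hypothetical name chosen to avoid clashing with fastai's own function.

```python
def error_rate_sketch(predictions, targets):
    """Fraction of predictions that differ from the true labels."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("need at least one example")
    wrong = sum(p != t for p, t in zip(predictions, targets))
    return wrong / len(targets)

# Made-up example: one of four predictions is wrong
preds = ["dog", "cat", "dog", "dog"]
truth = ["dog", "cat", "cat", "dog"]
rate = error_rate_sketch(preds, truth)
```

An error rate of 0.25 here corresponds to 75% accuracy, so lower values in the training table are better.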

  • Step 3: Use our model

1. Use the dog photo that we downloaded at the start to see what our model thinks of it

    is_dog,_,probs = learn.predict(PILImage.create('dog.jpg'))
    print(f'This is a: {is_dog}.')
    print(f"Probability it's a dog: {probs[1]:.4f}")

This is a: dog.
Probability it’s a dog: 1.0000
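
The code reads `probs[1]` because, as far as I understand fastai's defaults, the category names are sorted alphabetically, giving `['cat', 'dog']`, so index 1 belongs to `dog`. The sketch below shows how raw model scores could turn into probabilities and a label under that assumption. The scores are made up and the softmax is written by hand, so this illustrates the idea rather than reproducing what `learn.predict` does internally.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)                      # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

vocab = sorted({"dog", "cat"})           # alphabetical order: ['cat', 'dog']
logits = [-4.0, 6.0]                     # made-up scores for cat and dog
probs_demo = softmax(logits)
predicted = vocab[probs_demo.index(max(probs_demo))]
dog_probability = probs_demo[vocab.index("dog")]
```

Looking up the index with `vocab.index("dog")` instead of hard-coding `1` avoids surprises if the categories change. In the notebook, `learn.dls.vocab` should show the actual order.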

Original link: How to create a model from my data on Kaggle
