Tensorflow(@CVision)

The #smart home of Mark #Zuckerberg, founder of Facebook, which uses modern AI methods such as object recognition, face recognition, speech recognition, natural language processing, and more.
Zuckerberg writes about his motivation for the project and the steps he took:

https://www.facebook.com/notes/mark-zuckerberg/building-jarvis/10154361492931634/

My personal challenge for 2016 was to build a simple AI to run my home, like Jarvis in Iron Man...

Building Jarvis:
- Getting Started: Connecting the Home
- #Natural_Language
- #Vision and #Face_Recognition
- Messenger Bot
- Voice and #Speech_Recognition
- Facebook Engineering Environment

-------
Vision and Face Recognition:
About one-third of the human #brain is dedicated to vision, and there are many important #AI problems related to understanding what is happening in images and videos. These problems include #tracking (eg is Max awake and moving around in her crib?), #object_recognition (eg is that Beast or a rug in that room?), and face recognition (eg who is at the door?).
Face recognition is a particularly difficult version of object recognition because most people look relatively similar compared to telling apart two random objects — for example, a sandwich and a house. But Facebook has gotten very good at face recognition for identifying when your friends are in your photos. That expertise is also useful when your friends are at your door and your AI needs to determine whether to let them in.
To do this, I installed a few cameras at my door that can capture images from all angles. AI systems today cannot identify people from the back of their heads, so having a few angles ensures we see the person's face. I built a simple server that continuously watches the cameras and runs a two step process: first, it runs face detection to see if any person has come into view, and second, if it finds a face, then it runs face recognition to identify who the person is. Once it identifies the person, it checks a list to confirm I'm expecting that person, and if I am then it will let them in and tell me they're here.
This type of visual AI system is useful for a number of things, including knowing when Max is awake so it can start playing music or a Mandarin lesson, or solving the context problem of knowing which room in the house we're in so the AI can correctly respond to context-free requests like "turn the lights on" without providing a location. Like most aspects of this AI, vision is most useful when it informs a broader model of the world, connected with other abilities like knowing who your friends are and how to open the door when they're here. The more context the system has, the smarter it gets overall.
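
The two-step door process described above (detect a face, then recognize it and check a guest list) can be sketched roughly as follows. This is only an illustration, not Zuckerberg's implementation: the names `check_door`, `DoorDecision`, `fake_detect`, and `fake_recognize` are hypothetical, frames and faces are plain strings, and a real system would plug actual detection and recognition models into the two callables.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Set

# Assumption: frames and face crops are modeled as plain strings for illustration.
Frame = str
Face = str


@dataclass
class DoorDecision:
    person: Optional[str]  # recognized name, if any
    unlock: bool           # whether the door should open
    reason: str            # short explanation of the decision


def check_door(
    frames: List[Frame],
    detect_faces: Callable[[Frame], List[Face]],
    recognize: Callable[[Face], Optional[str]],
    expected_guests: Set[str],
) -> DoorDecision:
    """Step 1: face detection on every camera angle. Step 2: recognition + guest list."""
    # Several camera angles are pooled so that at least one may show a face.
    faces = [face for frame in frames for face in detect_faces(frame)]
    if not faces:
        return DoorDecision(None, False, "no face in view")

    last_seen: Optional[str] = None
    for face in faces:
        name = recognize(face)
        if name is None:
            continue
        last_seen = name
        if name in expected_guests:
            return DoorDecision(name, True, "expected guest")

    if last_seen is not None:
        return DoorDecision(last_seen, False, "recognized but not expected")
    return DoorDecision(None, False, "face not recognized")


# Hypothetical stand-ins for real models, used only to exercise the control flow.
def fake_detect(frame: Frame) -> List[Face]:
    return [frame] if frame.startswith("face:") else []


def fake_recognize(face: Face) -> Optional[str]:
    return {"face:alice": "alice", "face:bob": "bob"}.get(face)


if __name__ == "__main__":
    decision = check_door(
        ["back-of-head", "face:alice"], fake_detect, fake_recognize, {"alice"}
    )
    print(decision)
```

Keeping detection and recognition as separate callables mirrors the description in the note: the cheap detection step runs continuously, and the recognition step runs only when a face is actually found.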

#mark_zuckerberg #smart_home

#paper
A new and interesting paper from Google Brain, with #TensorFlow code
Training a single neural network on several different tasks at once!

One Model To Learn Them All
(Submitted on 16 Jun 2017)
pic: http://deepnn.ir/tensorflow-telegram-files/tensor2tensor.PNG

🔗abstract:
https://arxiv.org/abs/1706.05137

🔗Paper:
https://arxiv.org/pdf/1706.05137.pdf

🔗Code:
https://github.com/tensorflow/tensor2tensor

Deep learning is used in many areas, such as speech recognition, image classification, translation, and more.
Until now, however, each problem has required choosing a deep model with a task-specific architecture; after tuning hyperparameters and training the network weights, that model worked well on its own problem but could not be used for others.
This paper presents a single model that yields good results across several domains and is trained on multiple tasks. Specifically, this one model is trained simultaneously on ImageNet, several translation tasks, image captioning, speech recognition, and an English parsing task.
On many of these tasks the model is reported to be comparable with each domain's state-of-the-art models, which are trained only for that task, and in some domains it performs better than when the same model is trained on that domain alone.
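
As a rough picture of "one model, many tasks", the toy sketch below routes every task through the same shared body, with small task-specific adapters on the input and output sides. It is not the tensor2tensor API and not the paper's architecture: `shared_body`, `INPUT_ADAPTERS`, `OUTPUT_ADAPTERS`, `run`, and `round_robin` are hypothetical names, the "network" is a single multiplication, and the round-robin schedule is an assumption made for illustration.

```python
from itertools import cycle, islice
from typing import Callable, Dict, List

Vector = List[float]


def shared_body(hidden: Vector, weight: float) -> Vector:
    """Stand-in for the shared network: the same parameter serves every task."""
    return [weight * h for h in hidden]


# Hypothetical per-task adapters that map raw inputs into a common representation.
INPUT_ADAPTERS: Dict[str, Callable[..., Vector]] = {
    "image_classification": lambda pixels: [p / 255.0 for p in pixels],
    "translation": lambda text: [float(len(token)) for token in text.split()],
}

# Hypothetical per-task heads that turn the shared representation into an output.
OUTPUT_ADAPTERS: Dict[str, Callable[[Vector], int]] = {
    # Toy class prediction: index of the largest activation.
    "image_classification": lambda h: max(range(len(h)), key=h.__getitem__),
    # Toy "translation": just the number of output positions.
    "translation": lambda h: len(h),
}


def run(task: str, raw_input, weight: float = 2.0) -> int:
    """Task-specific input adapter -> shared body -> task-specific output adapter."""
    hidden = INPUT_ADAPTERS[task](raw_input)
    hidden = shared_body(hidden, weight)
    return OUTPUT_ADAPTERS[task](hidden)


def round_robin(tasks: List[str], steps: int) -> List[str]:
    """Assumed joint-training schedule: alternate tasks so each updates the shared body."""
    return list(islice(cycle(tasks), steps))


if __name__ == "__main__":
    print(run("image_classification", [0, 255, 128]))
    print(run("translation", "hello big world"))
    print(round_robin(list(INPUT_ADAPTERS), 5))
```

The point of the structure is that only the adapters know about a task's data format; everything learned in the shared body is available to all tasks, which is one plausible reason smaller tasks can benefit from joint training.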

#Google_Brain #tensor2tensor
#deep_learning
#speech_recognition, #image_classification, #translation


#source_code #paper

In this method, which Facebook open-sourced a few days ago, speech recognition is trained end-to-end.

Open sourcing wav2letter++, the fastest state-of-the-art speech system, and flashlight, an ML library going native

https://code.fb.com/ai-research/wav2letter/

CNN architectures are competitive with #recurrent architectures for tasks in which modeling long-range dependencies is important, such as #language_modeling, machine translation, and #speech_synthesis. In end-to-end #speech_recognition, however, recurrent architectures are still more prevalent for both acoustic and language modeling.
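
One way to see how a convolutional stack can cover long-range context is to compute its receptive field, that is, how many input frames influence a single output frame. The helper below is a generic illustration using the standard receptive-field recurrence for 1-D convolutions; the function name and the layer configurations are hypothetical and are not taken from wav2letter++.

```python
from typing import List, Tuple


def receptive_field(layers: List[Tuple[int, int, int]]) -> int:
    """Receptive field of stacked 1-D convolutions.

    Each layer is (kernel_size, stride, dilation), listed from input to output.
    """
    field = 1  # input frames seen by one output frame so far
    jump = 1   # distance, in input frames, between adjacent outputs of the current layer
    for kernel, stride, dilation in layers:
        field += (kernel - 1) * dilation * jump
        jump *= stride
    return field


if __name__ == "__main__":
    # Three plain layers with kernel size 3.
    print(receptive_field([(3, 1, 1)] * 3))
    # Four layers with kernel size 3 and dilations 1, 2, 4, 8.
    print(receptive_field([(3, 1, d) for d in (1, 2, 4, 8)]))
```

With strides or dilations the covered span grows much faster than the layer count, which is why deep convolutional models can compete on tasks that need long context.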

Wav2letter++ is the fastest state-of-the-art end-to-end speech recognition system available. We're also releasing flashlight, a fast, flexible ML library.

Alireza Akhavan, 10:10

#source_code

#Mozilla has released open source #speech recognition model & data. Word error rate 6.5%, which is close to human.
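
For readers unfamiliar with the metric, word error rate (WER) is commonly computed as the word-level edit distance between the reference transcript and the recognizer's output, divided by the number of reference words. The sketch below shows that calculation; the function name and the example sentences are illustrative and are not taken from DeepSpeech's evaluation code.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the") over six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Under this definition, a 6.5% WER means roughly 6 to 7 word-level errors per 100 reference words.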

Project DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques, based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow project to make the implementation easier.

Data: https://voice.mozilla.org/data
400k recordings, 500 hours of speech.

Model: https://github.com/mozilla/DeepSpeech
TensorFlow implementation of Baidu's DeepSpeech architecture.

https://deepspeech.readthedocs.io/en/latest/
DeepSpeech’s code documentation!

Related to:
https://t.me/cvision/875
https://t.me/cvision/850

#speech_recognition #Tensorflow

commonvoice.mozilla.org

Common Voice by Mozilla

Common Voice is a project to help make voice recognition open to everyone. Now you can donate your voice to help us build an open-source voice database that anyone can use to make innovative apps for devices and the web.

Alireza Akhavan, 09:31
