AR Bedtime Story

6.S063 Spring 2017 Final Project

by Cattalyya Nuengsigkapian


Project Info


An app that lets everybody create and share their own bedtime stories with their favorite characters and toys at the same time.

Project Ideas

Demo Video

  • Brainstorm Phase

  • Week 1: Sep 18 - 24, 2017

    Research

    This week I listed the project areas I'm interested in, such as AR, Computer Vision, and Photography. I researched related projects and research papers to gather more ideas and thought about how feasible each would be to build in a single semester.

  • Week 2: Sep 25 - Oct 1, 2017

    Brainstorm Project Ideas

    • 1. Mirror Mirror On The Wall: talk or use gestures to interact with a mirror to see the weather forecast, maps, photo albums, etc.
    • 2. Gesture & Mind-Controlled Computer: use a pointing gesture as a mouse and brain focus to click, using a Kinect and a BCI device.
    • 3. Remote Communication via Doll: drive a doll's motion with real-time human motion to let users communicate remotely.
    • 4. AR Bedtime Story: use printed signs to represent models or figures and let the user move and rotate them to create a story.
    • 5. Mind Photo Taking: use BCI technology to capture an image at a moment of high focus.

  • Week 3: Oct 2 - 8, 2017

    Graphic Elements

    Practiced drawing with Illustrator and included the results in my presentation.

  • Week 4: Oct 9 - 15, 2017

    Top 5 ideas

    This week I presented my top 5 ideas to the class: Top 5 ideas slide

  • Week 5: Oct 16 - 22, 2017

    Requirements

    Made the final decision on the project idea, researched existing sensors, and figured out how to satisfy all the requirements.

    • 1. Laser cutting and 3D printing will be used to create objects representing characters in the story (I plan to attach a QR code to each object, combined with 3D color and shape).
    • 2. Custom electronics: use a reflective optical sensor and a light-dependent resistor so the object can follow a laser path or a drawn black line, and interact with light.
    • 3. Custom code and data sent via WiFi: send the object's position over WiFi so the light-interactive object can respond to light in different ways, mostly focusing on moving the object to the correct location.
    • 4. React to user input: after the user moves an object to a different location, the mobile app detects the object by its QR code (or a better technology) and displays a 3D model on the phone.
    • 5. Solve a real-world problem: this lets both parents and children interact with the figures and move them around together to create their own story, for a better bedtime-story experience than the static images in a book.
  • Beginning Phase of AR Bedtime Story

  • Week 6: Oct 23 - 29, 2017

    Choosing AR platforms

    After I decided to work on AR Bedtime Story, I began researching iOS augmented reality development and chose among Swift, Objective-C, React Native, and Unity. I'm familiar with React Native, but it didn't have many tools supporting AR, so I initially used Swift with its ARKit library, hoping to have more control over the phone than with Unity. After playing with it for several days, I concluded that to finish all my features within a one-month scope, I needed tooling to facilitate image recognition, which the Vuforia platform in Unity can provide.

  • Week 7: Oct 30 - Nov 5, 2017

    Vuforia Unity + Lasercut design

    This week, I learned about Vuforia, its image targets, markers, etc. and started building my iOS AR app with Unity. I finally produced an AR iOS app that can scan example image targets and show the 3D figures. I designed the laser-cut parts that will serve as figure bases (image frames) to turn normal printed paper into toys. The criteria for the figure bases are: 1. durability (waterproof, tear-resistant) 2. low cost 3. scalability (reusable) 4. safety.

    For the electronics part, I researched how to make a light-following robot and planned to build a bristlebot. I also built a simple circuit to read a light-dependent resistor from the microcontroller and displayed its analog values.
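    The LDR reading boils down to mapping a raw ADC value to a brightness level. A minimal sketch of that idea (the function name and 10-bit range are illustrative assumptions; on the board itself the raw value would come from MicroPython's `machine.ADC`):

    ```python
    def adc_to_brightness(raw, max_raw=1023):
        """Map a raw ADC reading from the LDR voltage divider to a 0.0-1.0 brightness."""
        raw = max(0, min(raw, max_raw))  # clamp out-of-range readings
        return raw / max_raw

    # On the microcontroller the loop would look roughly like:
    #   from machine import ADC, Pin
    #   ldr = ADC(Pin(34))           # pin number depends on the board
    #   while True:
    #       print(adc_to_brightness(ldr.read()))
    ```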

  • Week 8: Nov 6 - 12, 2017

    Midterm Checkpoint

    I worked on the laser cutting and fixed my drawings so all the pieces fit tightly together. On the software side, I tried using custom image targets and integrated Vuforia Cloud Image Recognition. [Dynamic image target] I implemented UI in the app that lets the user click a button to change the model on a specific image target. [Story selector] I used cloud recognition to scan an image representing a story title, such as Cinderella or Beauty and the Beast, and assign all related figures to the image targets. On Wednesday, I presented my half-way project during the midterm presentation.

  • Continuing Features...

  • Week 9: Nov 13 - 19, 2017

    Fix bugs + Electronics change decision

    The beta version currently has a bug: when the user scans a story-title image such as Beauty and the Beast, the figures on the image targets aren't changed immediately. To refresh the figures, we have to move an image target out of the camera frame and bring it back in to re-trigger image-target detection. I found and solved the bug by registering all the image targets as listeners on the story-title image target, so they refresh as soon as the story changes.
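    The fix above is essentially the observer pattern. A minimal Python sketch of the idea (the actual code is C# in Unity; all class and method names here are mine, not the real API):

    ```python
    class StoryTitleTarget:
        """The story-title image target: notifies registered listeners when the story changes."""
        def __init__(self):
            self._listeners = []

        def register(self, listener):
            self._listeners.append(listener)

        def set_story(self, story):
            # Push the change to every figure target immediately,
            # instead of waiting for the target to be re-detected.
            for listener in self._listeners:
                listener.on_story_changed(story)

    class FigureTarget:
        """An image target holding one figure; swaps its model on a story change."""
        def __init__(self, name):
            self.name = name
            self.model = None

        def on_story_changed(self, story):
            self.model = f"{story}:{self.name}"  # load the model for the new story
    ```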

    Moreover, I also planned how to solve the scalability issue of users having a hard time finding their desired figures among possibly thousands of figures in the future. For this, I plan to use speech recognition to help assign models to the different image targets.

    For the electronics part, I decided to make a smart base (image frame) rather than a light-following robot, since I believe I can create a more interactive user experience with it. I planned to use a soft circular potentiometer to detect the direction of the user's touch drag. This dragging gesture would be used to select or scale the model. Note that these features were later replaced by changing the animation and clothes, which are more interesting since model selection can be done with a speech command.

  • Week 10: Nov 20 - 26, 2017

    Video Recording and Sharing

    After searching for screen-recording tools, I didn't find one that supports voice recording at the same time as screen recording. I integrated the Everyplay assets into my app to support screen recording and sharing. Everyplay also supports Facecam, which records the user's face from the front camera along with their voice, but this feature doesn't work with our AR app: it causes a bug that freezes the AR screen. Without Facecam, Everyplay doesn't support voice recording alongside screen recording, but this is acceptable since Everyplay supports video editing, so users can record their voice afterwards, and my planned subtitle feature can help with post-recording editing.

    For bedtime story sharing, Everyplay supports social media sharing such as Facebook and Twitter. I also created an AR Bedtime Story space on Everyplay for my users to share their stories with other users.

  • Advanced Features

  • Week 11: Nov 27 - Dec 3, 2017

    Speech Recognition: Subtitle and Command

    I integrated Google Cloud Speech Recognition and displayed subtitles on screen. Speech recognition supports two options: 1. auto-detected speech 2. manually clicking to start and stop recognition. The first option is very convenient and fits our goal for subtitles, but I also kept the second option, which can work better in noisy environments and for voice commands.

    I made use of auto voice detection by extracting keywords from speech and automatically assigning the matching model to an unoccupied image target. This auto model assignment can be enabled by clicking the "clear model" button to clear the targets.

    I also made speech recognition support commands to assign and clear a model on a specified target, using speech like "Command assign model_name to image_target" or "Command clear image_target", as demonstrated in the Demo Video.
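    Parsing these two command forms from a recognized transcript can be sketched with two regular expressions (a Python sketch of the idea; the return shapes and names are mine):

    ```python
    import re

    # The two spoken command forms from the app, matched case-insensitively.
    ASSIGN = re.compile(r"^command assign (\S+) to (\S+)$", re.IGNORECASE)
    CLEAR = re.compile(r"^command clear (\S+)$", re.IGNORECASE)

    def parse_command(transcript):
        """Parse an utterance into ('assign', model, target), ('clear', target), or None.

        Non-command speech returns None so it can be shown as a subtitle instead.
        """
        text = transcript.strip()
        m = ASSIGN.match(text)
        if m:
            return ("assign", m.group(1), m.group(2))
        m = CLEAR.match(text)
        if m:
            return ("clear", m.group(1))
        return None
    ```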

    [Electronics] After I got my soft potentiometer, I built the circuit and read its values as I did earlier with the LDR.

  • Week 12: Dec 4 - 10, 2017

    Server and Electronics

    I tried to integrate a switch and external power into my circuit, but I connected them incorrectly and blew up the MicroPython board, so I had to ask the TA for a new one.

    Without the switch, I tried sending touch position data via WiFi. I wrote a main.py script for the MicroPython board that connects the microcontroller to my iPhone hotspot and sends data to ThingSpeak. However, ThingSpeak didn't let me send multiple data points in a short window: I had to wait around 30 s after the latest data point before sending another.
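    The write side can be sketched in a few lines. The endpoint and `api_key`/`field1` parameters follow ThingSpeak's documented channel-update API; the key is a placeholder, and the `can_send` guard reflects the roughly 30-second throttle I ran into:

    ```python
    def thingspeak_update_url(api_key, value, base="https://api.thingspeak.com/update"):
        """Build the GET URL that writes one value to field1 of a ThingSpeak channel."""
        return f"{base}?api_key={api_key}&field1={value}"

    def can_send(now, last_sent, min_interval=30.0):
        """ThingSpeak throttles writes; only send if enough time has elapsed."""
        return (now - last_sent) >= min_interval

    # On the board, something like urequests.get(thingspeak_update_url(KEY, pos))
    # would fire the request whenever can_send(...) is True.
    ```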

    Since real-time behavior is important for letting the user interact with the model immediately, I decided to implement my own server with NodeJS and use normal HTTP requests to send and fetch data. Since my server only has to support my single client, rather than the millions of clients on ThingSpeak, my requests are very fast and can be repeated every millisecond.

    The data sent is the sequence of touch positions for each drag, from first touch to release. By grouping the positions into one touch sequence rather than sending single touch positions, I avoided many useless requests that would take up network resources.
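    The grouping step can be sketched as a small batching function (a Python sketch; the sample format, a stream of (touching, position) pairs, is my assumption about the raw readings):

    ```python
    def group_drags(samples):
        """Group a stream of (touching, position) samples into per-drag position lists.

        A drag starts at the first touched sample and ends at release; only complete
        sequences are emitted, so one HTTP request can carry a whole drag at once.
        """
        drags, current = [], None
        for touching, pos in samples:
            if touching:
                if current is None:
                    current = []      # a new drag begins
                current.append(pos)
            elif current is not None:
                drags.append(current)  # release: the drag is complete
                current = None
        return drags
    ```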

    [Lasercut] [Electronics] After I got a new MicroPython board, I designed the laser-cut pieces for the smart base, then soldered and assembled everything together.

  • Week 13: Dec 11 - 17, 2017

    Use wifi data + Speech to enable special effects

    [Wifi Data] The iOS app sends a GET request every 2 s to check whether there is a new drag, and obtains its speed and direction. I chose 2 s to avoid draining the battery and using too much network bandwidth while keeping a real-time feel: a drag usually takes about 1.5 s, so the remaining 0.5 s plus the small network lag from my own server is barely noticeable.

    [Server][UI] I reduced the computation burden on the iOS app by making the server compute and translate the touch data sent from the microcontroller into a drag direction and speed. The touch gestures are: 1. dragging clockwise to change the animation of the 3D model 2. dragging counter-clockwise to change the clothes (called version in the code) 3. a static touch to change the model. The server computes the speed and direction for 1 and 2, and the average touch position for 3. Although I didn't end up fully exploiting all of this data, such as the speed, I think it will be useful for scaling the project to support more gesture inputs and finer control over the model in the future.
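    The server-side classification can be sketched as below (my NodeJS server's actual logic, re-expressed in Python; the threshold value is illustrative, and it assumes the circular potentiometer reports increasing values clockwise and the drag doesn't cross the ring's seam):

    ```python
    def classify_drag(positions, static_threshold=5):
        """Classify one drag from its sequence of ring positions.

        Returns ('static', avg_pos) for a tap, or ('cw'/'ccw', speed) for a drag,
        where speed is positions travelled per sample.
        """
        travel = positions[-1] - positions[0]
        if abs(travel) < static_threshold:
            # Barely moved: treat as a static touch and report where it landed.
            return ("static", sum(positions) / len(positions))
        direction = "cw" if travel > 0 else "ccw"
        speed = abs(travel) / max(len(positions) - 1, 1)
        return (direction, speed)
    ```

    Doing this on the server keeps the phone's per-poll work down to one small JSON read.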

    [Speech] I further added special effects triggered by speech keywords, such as "raining" or "rain" to produce rain with a rain sound, and "snow" or "snowing" to produce a snow animation in the scene.

    Final Presentation Slide