How to improve 3D model & capture limitations

I’m working on a 3D model of a home that has some tall trees in the backyard and the adjacent side lot. I’ve tried various DD flight plans ranging from 85 ft to 150 ft in altitude, as well as varying distances from the POI, and I think that aspect is OK. Where I’m struggling is getting the side wall details between the trees and the house to come out accurately; so far it’s been pretty crazy. I’ll post links to two different approaches below.

I’m curious how low I can fly and what the shallowest gimbal pitch is that I can use to get these details. I’m wondering if I can fly 10-15 ft off the ground with the gimbal at zero or -5 degrees and fly straight lines (with the camera on a 5-second interval) around the house, between the trees, unobstructed, in addition to the POI orbits and the DD flight plan?

Attempt 1:
https://www.dronedeploy.com/app2/data/5eb596537db5b57efff80ac9;jwt_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzUxMiJ9.eyJleHAiOjI1MzQwMjMwMDc5OSwiaWQiOiI1ZWI1OTY1MzdkYjViNTdlZmZmODBhYzkiLCJzY29wZSI6WyI3NDU4NDAyN2E0X0VBRjYxODBFQTVPUEVOUElQRUxJTkUiXSwidHlwZSI6IlJlYWRPbmx5UGxhbiJ9.W43kcS7OfxnAIZe_GEfIqVN1Rm2D7tV_tmzKCOG4SxIciTbo_zYF0SC95goB75hYn30hqEyiIX18LEeaEO_seQ

Attempt 2:

https://www.dronedeploy.com/app2/data/5eb70f62b4a9b132ef758c11;jwt_token=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzUxMiJ9.eyJleHAiOjI1MzQwMjMwMDc5OSwiaWQiOiI1ZWI3MGY2MmI0YTliMTMyZWY3NThjMTEiLCJzY29wZSI6WyJlYzkwZTNhZGI4X0VBRjYxODBFQTVPUEVOUElQRUxJTkUiXSwidHlwZSI6IlJlYWRPbmx5UGxhbiJ9.Kcyjzvw6qEyd-eFd7vdrLtMZ-6lgWd8RK7nFiE-nKlov8XI7o1YoXNkW6HYsY2qSsb9XhtBtbMW1EYlckVxBLA

Any help is greatly appreciated.

I’m guessing we’re looking at the wall on the north/northwest side of the covered area, with the large tree in the backyard? I wouldn’t use anything less than a 15-degree pitch. You’ll have to get it from every angle you can, but the most important factor is to make sure you don’t get the horizon, or anything too far behind the focal point, in frame where it may end up blurry.
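
One way to sanity-check that advice: with the gimbal pitched down by p degrees, the top edge of the frame still points (vertical FOV / 2 − p) degrees above horizontal, so the horizon can sneak into frame until p exceeds half the vertical field of view. A minimal sketch, assuming a ~47-degree vertical FOV (my number for a typical drone camera, not anything from this thread):

```python
# Rough check on when the horizon leaves the frame entirely.
# vfov_deg (~47 degrees) is an assumed vertical field of view for a
# typical drone camera, not a figure taken from this thread.
def horizon_in_frame(pitch_deg, vfov_deg=47.0):
    """True if the top edge of the frame still looks at or above horizontal."""
    return pitch_deg <= vfov_deg / 2

for pitch in (0, 5, 15, 30):
    status = "horizon still in frame" if horizon_in_frame(pitch) else "horizon excluded"
    print(f"{pitch:>2} deg down: {status}")
```

In practice the trees or the house itself may block the horizon sooner, but it shows why a 0 to -5 degree pitch at low altitude tends to pull a lot of distant, blurry background into the shot.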

Virtual reality has been called an internet-sized opportunity. But the Internet only attracted the billions of users it has today once every user was empowered to create, contribute and share content, whether they could code or not.

Similarly, making something today for VR generally requires coding skills. However, there may be an easy solution from the surveying industry: Photogrammetry.

Photogrammetry is any process that uses photos to make measurements. Initially, surveyors used it to create flat maps. Google used it to create the 3D maps in Google Earth from high-resolution aerial photography. But specialized software has made the technology very easy to use, especially now that each of us has a high-resolution smartphone camera in our pocket.

[Image: Screenshot of Valve’s The Lab]

However, one of the limitations of photogrammetry is that the end-product doesn’t move. You can move it, but it isn’t alive. So whatever you make needs to be captivating while motionless.

We weren’t the first to this idea, nor to find an ideal use. To demonstrate the HTC Vive and welcome new users to VR, Valve used photogrammetry to create a breathtaking mountain vista in “The Lab.”

So at our lab, we set out to understand how journalists could use photogrammetry for improved storytelling on the web and in VR.

Ideation
One of the first things to acknowledge is the “if it isn’t broken, don’t fix it” dilemma of new technology.

Not every story actually benefits from 3D models made using photogrammetry, so to decide when photogrammetry may be valuable we developed three questions:

Will exploration be a key aspect of the story?
Will the model have a visceral, visual impact when it is motionless?
Will you have the control to get your subject to remain still long enough to take all of the photos?

Locations could easily pass all three questions, but, as practitioners, we knew most journalism is people-driven. So we found ourselves with a new question:

When are people visually explorable, viscerally impressive, and incredibly still, without coercion?

A portrait.

But why should a portrait be 3D?

We didn’t immediately have an answer to this question—but a use-case solution fell into our laps when we realized that an international cosplay competition would be in town in 2 weeks.

The contestants would have rich stories of why they put hundreds of hours of labor into costumes with stunning details from head-to-toe, front-and-back.

Capture Technique
Over a short period, we recognized three things we needed to understand and practice as we planned the C2E2 story.

Can we optimize our technique so that a cosplayer only needs to stand motionless for a few seconds instead of minutes?
Photogrammetry was designed to work from still photos; however, taking hundreds of individual still photos takes a few minutes even if you’re moving quickly. And asking someone to stay perfectly motionless for minutes when they have other plans, and aren’t being paid, is completely unreasonable.

But if you’re shooting video at 30 frames per second, you can capture 300 still images in just 10 seconds. Asking someone if they’re willing to stand still for 10-15 seconds is a much easier sell than 2-3 minutes.

But that means that you need to know how to get every angle you need in a quick and steady 15-second motion, with known ways to compensate for whatever pose the subject may take.

We tried a number of movements, but the one that worked most consistently was the simplest: a 270-degree orbit of the subject, starting behind their left foot, crossing in front of them, and ending behind their right foot, filming along a flat plane.
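
For a sense of the sampling density that orbit gives you, here is a quick illustration assuming roughly 30 fps video; the numbers are illustrative, not measurements from our shoots:

```python
fps = 30           # typical video frame rate (assumption)
duration_s = 15    # length of the orbit described above
arc_deg = 270      # walking 270 degrees around the subject

frames = fps * duration_s
print(f"{frames} frames, one roughly every {arc_deg / frames:.2f} degrees of arc")
# -> 450 frames, one roughly every 0.60 degrees of arc
```

You usually don’t need every one of those frames; adjacent frames are nearly identical, so thinning them out before reconstruction mostly just saves processing time.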

How do we capture large poses?
Capturing large poses was a new challenge, but not a daunting one.

Before preparing for the C2E2 story, we had mostly been modeling objects and environments, and the few people we had rendered were close-ups only.

We immediately began practicing capture with a wider arc to frame entire people, and then, more difficult still, people who were posing (as we expected would happen at C2E2).

As we began to step back to capture larger poses, we saw more and more strange blobs/artifacts in the models, and we quickly realized that these were related to the environment where we were capturing.

What kind of environments will consistently yield acceptable results?
Initially we chose controlled environments that had no movement in sight, but that often left us in cramped hallways with little room to walk and lighting that cast hard shadows. We needed much larger spaces to walk around our subject if we wanted to capture from head-to-toe. And we needed much better lighting.

Over time we realized that we only need a controlled environment for the space between our camera and our subject. This meant we could position our subject in large, well-lit rooms, even when there were lots of people around, so long as we could keep the crowds of people from walking through our capture radius for 15 seconds.

Once we knew how to position ourselves and capture our posing subjects quickly, we were as ready as we could be. We went to C2E2 for the whole weekend and captured 41 different cosplayers. You can find our full collection of cosplay models here.

Software
Once we returned from the convention with a lot of cosplayer videos, we needed to turn them into 3D models.

The first step was to take each video and extract the individual frames as separate .tiff image files. We used Adobe Premiere to do this.
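
If you would rather script this step than click through Premiere, a small OpenCV sketch can do the same job; the file names and frame step below are placeholders:

```python
import os
import cv2  # pip install opencv-python

def extract_frames(video_path, out_dir, step=1):
    """Write every `step`-th frame of the video as a numbered .tiff file."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    index = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:          # end of video (or unreadable frame)
            break
        if index % step == 0:
            cv2.imwrite(os.path.join(out_dir, f"frame_{saved:04d}.tiff"), frame)
            saved += 1
        index += 1
    cap.release()
    return saved

# Hypothetical usage, one folder of frames per cosplayer clip:
# extract_frames("cosplayer_01.mp4", "cosplayer_01_frames", step=2)
```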

We explored two different pieces of photogrammetry software: Reality Capture and PhotoScan.

After some comparative testing, we found that we were getting better models, in much less time, with Reality Capture, so that’s what we used to produce everything in our C2E2 article and shown above. (Read more about our testing in our photogrammetry tools review.)

Finally, we hosted all of our finished models on Sketchfab so that we could easily embed them into our articles.

Presentation
Going in, we knew that this article wasn’t going to be an all-encompassing virtual narrative, so we made sure to quickly interview each person we modeled. Because we interviewed each person, and kept ourselves open to what we heard, we could weigh the quality of each model, as well as the quotes we’d received, when structuring our narrative arc.

We knew that this story was going to remain about the cosplayers and their passion. We decided to open with high school student Morgan because, while her finished model wasn’t the best, she had a great, relatable story for why she was there. And as a young amateur, she was a great foundation for an arc from youth to amateur, to competitor, to champion, to professional who wants to inspire more youth.

We wanted to focus on “show, don’t tell,” so we kept the rest of the story minimal.

Current Limitations
One disappointing limitation was the difficulty of incorporating audio into the story. We wanted you to hear from each cosplayer when you saw them, but that functionality wasn’t readily available.

Additionally, lots of very interesting cosplayers simply didn’t reconstruct as good 3D models, for a variety of reasons.

Nice post, very comprehensive! I’ll admit I haven’t read the whole thing through yet, but I will this evening when I have some time to digest it. A few initial comments on bringing what we are doing to VR, and on photogrammetry in general.

It is already very easy to bring drone models from photogrammetry into VR. What is lacking is the quality of the model. Our models do not need to be alive, as they are merely scenery of the kind already being used in gaming, but there are some things that can be done to wake them up if you choose, and I think most of those things lie in the BIM virtualization world and labs like yours. We can do some of it, but we work in mass shapes, and textural content is not of much use to us. This is where you would need to bring in the rendered architectural model.

Something that is going to be a hurdle, and that is very different about our realm versus other VR platforms, is that they are in a world of their own. Very rarely do I see VR or BIM content that is actually georeferenced. Being georeferenced and geometrically accurate to the centimeter level causes our data to explode in size as it becomes more realistic and the meshes become denser.

Something that is changing in the drone and photogrammetry world is cameras. They are about to make a huge jump in quality for our format, and we also need to start thinking about multispectral and infrared imaging and how they can be used in modeling and analysis. Most of these things are already available, but they are cost-prohibitive and not yet ready for the masses. Processes need to be tightened on both the front and back ends. Adding this content will lead to a further explosion of data size and will therefore require exponentially more power to process. Local processing will have to go away unless publicly available hardware makes a huge leap, and cloud processing is going to explode again.

One thing I have been discussing with my area mapping peers is creating an effort, in conjunction with local municipalities, to map entire cities and surrounding areas at better-than-decimeter accuracy. As an example, the City of Round Rock, at about 36 square miles, could be mapped collectively in a little less than two weeks. These small, thriving suburbs are closing the gap around the major cities and would be a great space to work in virtual reality. Add the BIM models that are being made for virtually every building construction project nowadays and you could get some very realistic content. I’d love to play COD in my city!
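
A quick back-of-the-envelope on what that claim implies, where “12 flying days” is just my reading of “a little less than two weeks”:

```python
area_sq_mi = 36        # rough area of the City of Round Rock
flight_days = 12       # "a little less than two weeks" of flying
acres_per_sq_mi = 640  # conversion factor

daily_sq_mi = area_sq_mi / flight_days
print(f"~{daily_sq_mi:.0f} sq mi (~{daily_sq_mi * acres_per_sq_mi:.0f} acres) per flight day")
# -> ~3 sq mi (~1920 acres) per flight day
```

That daily rate would take a coordinated, multi-pilot effort, which is exactly why doing it in conjunction with municipalities and local peers makes sense.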

I’ll be back after I have read more in depth. Thanks!