We present a real-time approach for acquiring 3D objects with high fidelity using hand-held consumer-level RGB-D scanning
devices. Existing real-time reconstruction methods typically do not take the user's point of interest into account, and thus may fail
to produce clean reconstructions of the desired objects because of distracting objects or backgrounds. In addition, any change
in the background during scanning, which often occurs in real scenarios, can easily break the entire reconstruction process. To address these issues, we incorporate visual saliency into a traditional real-time volumetric fusion pipeline. Salient regions detected in the RGB-D frames indicate the objects the user intends to scan; by inferring this intent, our approach places more emphasis on important targets while suppressing interference from unimportant objects. Experimental results on real-world scans demonstrate that our system effectively acquires the geometry of salient objects in cluttered scenes, even when the background changes.
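
As a rough illustration of how saliency could enter a volumetric fusion step, the sketch below weights a standard running-average TSDF update by a per-voxel saliency value. This is only one plausible reading of the idea, not the method described here: the function name `fuse_tsdf`, the parameters `trunc` and `min_saliency`, and the assumption that saliency has already been projected from the RGB-D frame onto the voxels are all hypothetical.

```python
import numpy as np

def fuse_tsdf(tsdf, weight, sdf_obs, saliency, trunc=0.05, min_saliency=0.1):
    """Saliency-weighted running-average TSDF update for one frame (illustrative).

    tsdf, weight : current per-voxel truncated distance and accumulated weight
    sdf_obs      : per-voxel signed distance observed in this frame (NaN if unobserved)
    saliency     : per-voxel saliency in [0, 1] (assumed to be precomputed)
    trunc        : truncation distance, same units as sdf_obs (hypothetical default)
    min_saliency : observations below this saliency are ignored (hypothetical default)
    """
    # Normalize and truncate the observed distance to [-1, 1].
    d = np.clip(sdf_obs / trunc, -1.0, 1.0)

    # Observation weight: zero for unobserved or non-salient voxels,
    # otherwise proportional to saliency.
    ignored = np.isnan(sdf_obs) | (saliency < min_saliency)
    w_obs = np.where(ignored, 0.0, saliency)

    # Replace NaNs so they cannot propagate through the arithmetic below.
    d = np.where(ignored, 0.0, d)

    new_weight = weight + w_obs
    safe_weight = np.where(new_weight > 0, new_weight, 1.0)  # avoid division by zero

    # Weighted running average; ignored voxels keep their previous value.
    new_tsdf = np.where(ignored, tsdf, (tsdf * weight + d * w_obs) / safe_weight)
    return new_tsdf, new_weight
```

Under these assumptions, background voxels with low saliency never accumulate weight, so a background that changes during scanning leaves the fused volume untouched, while salient voxels are averaged as in an ordinary fusion update.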