I'm doing a small project for motion detection on some pretty restrained hardware (an old Packard Bell that my cousin had in her closet). I want to detect motion in a series of images taken from a video stream. Unfortunately, the noisy webcam is leading to some pretty garbage results in low light conditions when using a count of changed pixels as the threshold for motion.
My first thought for resolving this is to get the x and y positions of the changed pixels and see if their position is more than 'n' standard deviations away from the average. From there, I could exclude everthing outside of a certain threshold of 'n' as noise and only use the closest cluster as the detection. The problem with this is that I don't really have the computational resources to pull this operation every frame. I guess I could just get the sdev every few hundred frames, but even that is a bit much.
Another option would be to throw a blur filter over the image, but that may end up being more computationally intensive. What do you guys think would be the best move?
I'd definitely go with the blurring approach (linear), but the two approaches are going to end up being almost equivalent. Also why not just limit the frequency of frame operations by the speed of the operation itself? So instead of fixing it to run after every n frames, sample the frame once the previous frame operation finishes. You may find that you could get by with a lower rate.
In case my last sentence wasn't clear in other words, define the time period as the time it takes to process one frame, rather than the time it takes for a new frame to arrive.
Luke Scott
It might also help to take a distance between the old and now colors to filter out noise, but this does increase computation time). (Note that you DON'T want to use Euler distance over RGB).
Aaron Miller
stddev is trash made by mathematicians too lazy to deal with absolute values. Use the average absolute deviation
I find that hard to believe. Old P3 running at 600 MHz can do 2054 MIPS. At resolution of 640x480 you need to process 307200 pixels. For stddev you need to first get an average of a whole image which will take 307200 additions and division by N. After this you need to do subtraction and multiplication for each pixel using up 2 * 307200 instructions. Then you divide by N and do a square root, which on first guess shouldn't take more than 1000 instructions. In total calculating stddev took around 0.5 ms, which means it's possible to calculate stddev at 2226 fps. Even if I'm off by factor of 100 that's still 22 fps. I don't see the reason why calculating stddev every frame is not possible, unless you're capturing at 1080p, which would be stupid. What language are you using?
Hunter Campbell
Node.
Luis Russell
This is as retarded as the movement to replace pi with tau = 2 pi.
Node isn't a language. If you're actually OP, you deserve to fail for using such crap in a low-level domain.
I've heard this before but I dont really get it, whats wrong with euler distance for rgb space? What distances should I use?
Levi Brown
It's the same reason you can't generate a random color by just choosing a random red, green, and blue value. If you try this out yourself, it should be very appearant that the color isn't quite random.
Parker Collins
Random color != Random hue. What you describe is in fact the correct solution to generating a random color within the gamut. Wanting a "slightly less random color of same apparent brightness and saturation because my caveman eyeball cant tell the difference", is a different problem with a different solution.
Jeremiah Jackson
Use a real computer to develop robust latent classifier models of motion and non-motion video and then feed those models into your shitty computer. With enough "non-motion" examples and feedback, the model can learn pretty fast what dynamic random noise looks like, and learn to ignore it.
Alexander Bennett
Has anyone even asked what OP is trying to do? Is anyone addressing his performance bottleneck? I assume he just wants to make a shitty intruder detection system with an old computer and webcam. If that's the case, it really shouldn't be too hard. Let me elaborate on what I said here:
1. Determine the size of the smallest moving object you need to detect, the bigger the better. (50 pixels wide?) 2. Downsample the image to a smaller buffer by averaging blocks of that size (1000x1000 / 50x50 pixels = 20x20 pixels ) 3. Subtract this downsampled buffer from the previous downsampled buffer, element-wise, to get the change per frame for each pixel. 4. Determine some threshold of change that will trip the system.
I assume the webcam is fairly hi-res. If you're going to downsample a lot, it may be better to randomly sample each chunk a few times, rather than sum the entire thing. Other than that it really shouldn't be an expensive operation. Of course this assumes what your intended use is. What are you trying to do OP?
Bentley Collins
The camera is 640x480. I considered your technique, but the issue is that I have a range of distances I want to look at. Range 1 is the porch. Range 2 is the yard. Range 3 is a road beyond the yard. Because of this, I'd have to work multiple layers of the image.
I'll upload a 640x480 picture in a minute to give an idea of what I'm wanting to look at.
Nicholas Scott
This guy has a good idea of dividing the image into chunks. If your image is 640x480 then 40x40 pixel chunks will divide it into 16x12 chunks. I'm not so sure I agree with the idea of down sampling by averaging chunks. It may work but if it is too costly, then go with subsampling by checking the center square of nxn pixels for each 40x40 block. For example, maybe the center 2x2 pixels. Thus you would be checking 16*12*4 = 768 pixels. Then for each 2x2 center square, keep track of the running median (for some time window size of n frames) of the sum of those pixels, perhaps with a skiplist rhettinger.wordpress.com/tag/running-median/ Use this to calculate Median Absolute Deviation (MAD) every few frames. The idea is to have the algorithm that is running be light enough that it can still run fully when something is detected and you need to add on more computations. Once MAD detects an outlier in one or more of the center 2x2 squares of some blocks, increase the inner square size to 4x4 for those blocks. To be able to do this means you need previous image frames as far back as n (the window size) stored in main memory and you would need to pull required pixels up. Also, stop deleting old frames at this point so you have a window or partial window of old frames without the intruder in them as reference. Once the blob of outliers is determined to be of critical mass, trigger some alarm functions that send a text message to you or something. Also, the critical mass should exist for a certain period of time to avoid being triggered by a light flash (lightning or passing car). One downside to this method is that an intruder could get past it by moving very very slowly.
Jace Allen
Just thought of how to get around the slo-mo intruder. Use every 1000th frame in calculating your running median, and store every 1000th frame for the n frames making up your moving window size. So you will be checking the center 2x2 pixel square for every block for every frame, but only storing the new pixel sum values if it is the 1000th frame since the last was stored, and deleting the first in the current window.
Jaxon Ramirez
Also, if a data structure like a slip list is too big to use for each block of pixels, then rather then getting the median, just use a rolling mean which just requires storing a window of n values and the current sum, S. New sum is S = S-x+y where x is oldest value and y is newest. Then mean = S/n. This method will be more sensitive to outliers though.
Isaiah Wood
Forgot to say, that the running mean will be constant time calculation for each frame while the running median will be log n. So probably just go with Average Absolute Deviation.