Thursday, July 25, 2019

TLD Implementation in Python With Explanation


#importing cv2 and system package
import cv2
import sys
# How do you ensure that your code will work no matter which version of OpenCV your production environment is using?
# Extract major, minor, and subminor version numbers
(major_ver, minor_ver, subminor_ver) = (cv2.__version__).split('.')
# Every Python module has its __name__ defined; if it is '__main__', the module is being
# run standalone by the user and we can take the corresponding appropriate actions.
if __name__ == '__main__':
    # Set up tracker.
    tracker_type = 'TLD'
    # The generic cv2.Tracker_create was removed in OpenCV 3.3, so use the
    # per-algorithm constructor on newer versions. (On recent OpenCV 4.x
    # builds, TLD lives in the legacy module: cv2.legacy.TrackerTLD_create())
    if int(major_ver) == 3 and int(minor_ver) < 3:
        tracker = cv2.Tracker_create(tracker_type)
    else:
        if tracker_type == 'TLD':
            tracker = cv2.TrackerTLD_create()
    # Read video
    video = cv2.VideoCapture("./videos/chaplin.mp4")
    # Exit if video not opened.
    if not video.isOpened():
        print("Could not open video")
        sys.exit()
    # Read first frame.
    ok, frame = video.read()
    if not ok:
        print('Cannot read video file')
        sys.exit()
    # Select a rectangular region of interest (ROI) around the object.
    # cv2.selectROI lets you draw a rectangle on the frame; the tracker
    # will then follow the contents of that rectangle.
    bbox = cv2.selectROI(frame, False)
    # Initialize tracker with first frame and bounding box
    ok = tracker.init(frame, bbox)
    # Log the tracked coordinates to a text file
    file = open("Coordinate.txt", "w")
    while True:
        # Read a new frame
        ok, frame = video.read()
        if not ok:
            break

        # Start timer
        timer = cv2.getTickCount()
        # Update tracker
        ok, bbox = tracker.update(frame)
        # Calculate frames per second (FPS)
        fps = cv2.getTickFrequency() / (cv2.getTickCount() - timer)

        # Draw bounding box
        if ok:
            # Tracking success
            p1 = (int(bbox[0]), int(bbox[1]))
            p2 = (int(bbox[0] + bbox[2]), int(bbox[1] + bbox[3]))
            cv2.rectangle(frame, p1, p2, (255, 0, 0), 2, 1)

            # Display X and Y coordinates (only when tracking succeeded,
            # otherwise p1 and p2 would be stale or undefined)
            cv2.putText(frame, "X and Y Coordinate " + str(p1) + " and " + str(p2),
                        (100, 70), cv2.FONT_HERSHEY_SIMPLEX, 0.75, (50, 170, 50), 2)

            # Elapsed time (in seconds) spent on this frame's update
            elapsed = (cv2.getTickCount() - timer) / cv2.getTickFrequency()
            file.write(str(elapsed) + " :: UpperLeft(x,y) and BottomRight(x,y) "
                       + str(p1) + " and " + str(p2) + "\n")
        else:
            # Tracking failure
            cv2.putText(frame, "Tracking failure detected", (100, 80),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.75, (0, 0, 255), 2)
        # putText signature: cv2.putText(img, text, position, font, fontScale, color, thickness)

        # Display tracker type on frame
        cv2.putText(frame, tracker_type + " Tracker", (100, 20),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.75, (50, 170, 50), 2)

        # Display FPS on frame
        cv2.putText(frame, "FPS : " + str(int(fps)), (100, 50),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.75, (50, 170, 50), 2)

        # Display result
        cv2.imshow("Tracking", frame)

        # Exit if ESC pressed
        k = cv2.waitKey(1) & 0xff
        if k == 27:
            break

    # Close the coordinate log whether we exit via ESC or end of video
    file.close()

Credit : GitHub
Note : The code has been taken from GitHub and modified according to need.

Tuesday, July 9, 2019

FAQ about UOH MTech Admission

Can MTech CS or IT or IS students learn AI ?

Definitely. You can learn anything you want.

Can MTech CS or IT or IS students take AI electives ?

Yes, but not all of them; they can take only some of the AI electives.

How many electives can a student take in the first and second semesters ?

It depends on the course; generally 3 electives per semester plus 2 mandatory core subjects.

How many food canteens are there in UOH ?

Let us divide UOH into two parts: the North Campus and the South Campus.
In the North Campus, the following canteens are there:

1. Student canteen
2. Goaps
3. F canteen
4. North Shop Complex
5. Chemistry canteen
6. Night Canteen (its timing is 10 PM to morning)

In the South Campus, the following canteens are there:

1. South Shop Complex
2. Hotels and dhabas are available just outside the South Campus.

Visiting places inside the campus ?
There are a number of places to visit. The playlist below contains more than 15 videos about the UOH campus.


From where to purchase bed sheets, buckets, mugs, etc. ?
These things are not available inside the campus. Just outside the South Campus there are shops where you can purchase all of them.

What is the level of assignments here ?
It depends on the faculty. Generally you need to do them on your own. If you are found copying from others, the marks will be divided equally.

What about labs here ?
Besides the 5 subjects, there are two labs in the first semester and one lab in the second.
The two labs in the first semester are:
1. DSP Lab (Data Structures and Programming Lab)
2. Algorithm Lab

From where to purchase cycles?
For a new cycle, you need to go outside the campus.
For an old cycle, contact seniors.

Sports facilities in UOH
The Sports Complex is in the North Campus.

What are the timings of classes ?
Generally it is from 9:30 AM to 6:30 PM, but you will get breaks between classes.
Classes do not run the whole day.

What about holidays ?
There are no classes on Saturday and Sunday. Other holidays are mentioned in the Academic Calendar.

What about the mid-term exams here ?
Generally there are two minors here; some faculty take three minors (out of the three, the best two are considered). The total minor marks are 40 and the external exam is 60 marks in each subject.

From which month will I get the GATE stipend ?
You will receive your first GATE stipend in December or January.

Is 75% attendance mandatory for the GATE stipend ?
Yes, it is strictly mandatory.
Note: the 75% is overall, across all subjects.

From where to learn AI ?
Andrew Ng's ML course is available on Coursera and YouTube.
Geoffrey Hinton's course is also available on YouTube.
You can also purchase other AI courses, like Applied AI.

If you have any query, please write to us at
hemjoshi745@gmail.com or WhatsApp 9675467414.

If you like this article, subscribe to our YouTube channel.


Monday, July 8, 2019

TLD (Tracking , Learning and Detection) : Complete Overview with Python Code


Image Credit : Google 

In this blog, we will learn about object tracking using TLD. TLD stands for Tracking, Learning and Detection.

What is Object Tracking?

Locating an object in successive frames of a video is called object tracking. For more about object tracking, click here. There is another term, object detection.

Object detection is the task of localizing objects in an input image. The definition of an "object" varies. It can be a single instance or a whole class of objects.

Object detection methods are typically based on local image features or a sliding window. For more about object detection, click here.
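The sliding-window idea mentioned above can be sketched in a few lines of Python. This is a generic illustration with made-up window and image sizes, not part of the TLD code:

```python
import numpy as np

def sliding_windows(image, win=(2, 2), step=1):
    """Sketch of sliding-window detection: yield every (x, y) position
    and the patch under the window; a classifier would score each patch."""
    h, w = image.shape
    for y in range(0, h - win[1] + 1, step):
        for x in range(0, w - win[0] + 1, step):
            yield (x, y), image[y:y + win[1], x:x + win[0]]

img = np.arange(16).reshape(4, 4)
positions = [pos for pos, patch in sliding_windows(img)]
print(len(positions))  # 9 window positions on a 4x4 image with a 2x2 window
```

A real detector evaluates the window at many scales as well, which multiplies the number of patches; this is why the early-rejection ideas discussed later matter.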

How can we do object tracking?

There are many algorithms for object tracking, and each has its own pros and cons. In this article, we will discuss the TLD algorithm only.

Pros of TLD

1. It works well even under occlusion.
2. TLD is good at learning the appearance of the object.

Cons of TLD

1. It does not work well when the object rotates by about 90 degrees or more.
2. It can fail when the object disappears from the frame.


TLD works as follows: we mark the object in the first frame with a rectangle to indicate the location of the object we want to track. The object is then tracked in subsequent frames using the tracking algorithm.
First, we define our goal.
Objective: Given a bounding box defining the object of interest in a single frame, our goal is to automatically determine the object's bounding box in every frame that follows, or indicate that the object is not visible.
The video stream is to be processed at frame rate, and the process should run indefinitely long. We refer to this task as long-term tracking.

Frame rate:
Frame rate is the frequency at which consecutive images, called frames, appear on a display.

A long-term tracker should be able to:
1. Detect the object when it reappears in the camera's field of view.
2. Handle scale and illumination changes.
3. Handle background clutter.
4. Handle partial occlusions.
5. Operate in real-time.

Long-term tracking can be approached either from the tracking or from the detection perspective.

More about TLD

1. The tracker follows the object from frame to frame.

2. The learning estimates the detector's errors and updates the detector to avoid these errors in the future.

3. The detector localizes all appearances that have been observed so far and corrects the tracker if necessary.

The Block Diagram of the TLD framework is below.

The starting point of our research is the acceptance of the fact that neither tracking nor detection can solve the long-term tracking task independently. 

Why Tracking and Detection Together?

If tracking and detection operate simultaneously, there is potential for each to benefit from the other.

1. A tracker can provide weakly labeled training data for a detector and thus improve it during run-time.

2. A detector can re-initialize a tracker and thus minimize tracking failures.

3. Each sub-task is addressed by a single component and the components operate simultaneously.

4. The tracker follows the object from frame to frame. The detector localizes all appearances that have been observed so far and corrects the tracker if necessary.

5. The learning estimates the detector's errors and updates the detector to avoid these errors in the future.
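The cooperation of the three components listed above can be sketched as a simple per-frame loop. The Tracker, Detector, and Learner classes below are hypothetical stand-ins for illustration only, not the real TLD implementation:

```python
# Illustrative skeleton of the TLD loop; the component classes are
# hypothetical stand-ins, not a real TLD implementation.

class Tracker:
    """Follows the object frame to frame (assuming small motion)."""
    def update(self, frame, prev_box):
        # a real tracker estimates motion; here we just keep the box
        return prev_box  # (x, y, w, h), or None on failure

class Detector:
    """Scans the whole frame for all appearances learned so far."""
    def detect(self, frame):
        return []  # list of candidate boxes

class Learner:
    """P-N learning: mines examples and corrects the detector."""
    def update(self, frame, box, detections):
        pass  # grow the detector's model, fix its errors

def tld_step(frame, prev_box, tracker, detector, learner):
    tracked = tracker.update(frame, prev_box)   # tracking
    detections = detector.detect(frame)         # detection
    # fuse: a confident detection can re-initialize a failed tracker
    if tracked is not None:
        box = tracked
    else:
        box = detections[0] if detections else None
    if box is not None:
        learner.update(frame, box, detections)  # learning
    return box

box = tld_step(frame=None, prev_box=(10, 10, 20, 20),
               tracker=Tracker(), detector=Detector(), learner=Learner())
print(box)  # (10, 10, 20, 20)
```

The point of the skeleton is the data flow: the tracker's output feeds the learner, and the detector's output can override the tracker when it fails.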

More about TLD Framework 

1. TLD is a framework designed for long-term tracking of an unknown object in a video stream.

2. The tracker estimates the object's motion between consecutive frames under the assumption that the frame-to-frame motion is limited and the object is visible.

3. The tracker is likely to fail and never recover if the object moves out of the camera view.

4. The detector treats every frame as independent and performs a full scan of the image to localize all appearances that have been observed and learned in the past.

Framework means a basic structure underlying a system, concept, or text.

5. The learning observes the performance of both the tracker and the detector, estimates the detector's errors, and generates training examples to avoid these errors in the future. The learning component assumes that both the tracker and the detector can fail.

6. By virtue of the learning, the detector generalizes to more object appearances and discriminates against the background.

We have talked about tracking and detection. Now we will talk about the learning portion.

In TLD, we use P-N learning. More precisely, you can say P-expert and N-expert learning.

P-N learning estimates the errors by a pair of "experts":

1. The P-expert estimates missed detections, and

2. The N-expert estimates false alarms.

The learning process is modeled as a discrete dynamical system, and the conditions under which the learning guarantees improvement are found.


P-expert 

1. The P-expert exploits the temporal structure in the video and assumes that the object moves along a trajectory.

2. The P-expert remembers the location of the object in the previous frame and estimates the object's location in the current frame using a frame-to-frame tracker.

3. If the detector labeled the current location as negative (i.e., made a false negative error), the P-expert generates a positive example.

4. The goal of the P-expert is to discover new appearances of the object and thus increase the generalization of the object detector.

5. The P-expert can exploit the fact that the object moves on a trajectory and add positive examples extracted from such a trajectory.

6. In every frame, the P-expert outputs a decision about the reliability of the current location (the P-expert is an online process). If the current location is reliable, the P-expert generates a set of positive examples that update the object model and the ensemble classifier.
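The P-expert's rule, stated in points 3 and 6 above, can be sketched as follows. The function and its arguments are hypothetical simplifications of the real component:

```python
def p_expert(tracked_box, detector_labels, trajectory_reliable):
    """Sketch of the P-expert: if the tracked location is reliable but
    the detector labeled it negative (a missed detection), emit it as
    a new positive training example."""
    positives = []
    if trajectory_reliable and tracked_box is not None:
        if detector_labels.get(tracked_box) != "positive":
            positives.append(tracked_box)  # false negative -> positive example
    return positives

# The detector missed the object at (40, 30, 24, 24) on a reliable trajectory:
examples = p_expert((40, 30, 24, 24), detector_labels={}, trajectory_reliable=True)
print(examples)  # [(40, 30, 24, 24)]
```

These positive examples are what lets the detector learn appearances it has never seen before.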

N- Expert 

1. The N-expert exploits the spatial structure in the video and assumes that the object can appear at a single location only.

2. The N-expert analyzes all responses of the detector in the current frame and the response produced by the tracker, and selects the one that is the most confident.

3. Patches that do not overlap with the maximally confident patch are labeled as negative. The maximally confident patch re-initializes the location of the tracker.

4. The N-expert generates negative training examples. Its goal is to discover clutter in the background against which the detector should discriminate.

5. The key assumption of the N-expert is that the object can occupy at most one location in the image. Therefore, if the object location is known, the surrounding of that location is labeled as negative.

6. The N-expert is applied at the same time as the P-expert, i.e., when the trajectory is reliable.
For the update of the object detector and the ensemble classifier, we consider only those patches that were not rejected by either the variance filter or the ensemble classifier.
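The N-expert's "one location only" assumption (points 3 and 5 above) can be sketched with a simple overlap test. The overlap threshold is an arbitrary illustrative value:

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def n_expert(confident_box, candidate_boxes, overlap_thresh=0.2):
    """Sketch of the N-expert: the object occupies one location only,
    so patches not overlapping the most confident one are negatives."""
    return [b for b in candidate_boxes if iou(confident_box, b) < overlap_thresh]

negatives = n_expert((0, 0, 10, 10), [(1, 1, 10, 10), (50, 50, 10, 10)])
print(negatives)  # [(50, 50, 10, 10)]
```

Here the box at (1, 1) overlaps the confident location heavily and is left alone, while the distant box becomes a negative example for the detector.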







Object Detection

Image Credit : IEEE Xplore

Object detection is the task of localizing objects in an input image.
The definition of an "object" varies. It can be a single instance or a whole class of objects.
Object detection methods are typically based on local image features or a sliding window.
The feature-based approaches usually follow the pipeline of:
  (i) feature detection,
 (ii) feature recognition, and
(iii) model fitting.

Exploiting the fact that background is far more frequent than the object, a classifier is separated into a number of stages, each of which enables early rejection of background patches, thus reducing the number of stages that have to be evaluated on average.
Training such detectors typically requires a large number of training examples and intensive computation in the training stage to accurately represent the decision boundary between the object and the background.
An alternative approach is to model the object as a collection of templates. In that case, learning involves just adding one more template.
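The staged classifier with early rejection described above can be sketched as follows. The two stages are hypothetical, loosely mirroring TLD's variance filter and ensemble classifier:

```python
def cascade_classify(patch, stages):
    """Sketch of a staged (cascade) classifier: each stage can reject a
    background patch early, so most patches never reach the later,
    more expensive stages."""
    for stage in stages:
        if not stage(patch):
            return False   # early rejection: treated as background
    return True            # survived all stages: likely the object

# Hypothetical stages: a cheap variance filter, then a stricter score check
stages = [
    lambda p: p["variance"] > 10,   # variance filter (cheap, runs on everything)
    lambda p: p["score"] > 0.5,     # e.g. ensemble classifier (costlier)
]

print(cascade_classify({"variance": 2,  "score": 0.9}, stages))  # False
print(cascade_classify({"variance": 50, "score": 0.8}, stages))  # True
```

Because flat, low-variance background patches die at the first stage, the average per-patch cost stays low even though the full image is scanned.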

Object tracking…

Image Credit : Google 


What is Object Tracking ?

Locating an object in successive frames of a video is called tracking.

1. Object tracking is the task of estimating the object's motion.
2. Trackers typically assume that the object is visible throughout the sequence.
3. Here we focus on methods that represent objects by geometric shapes and estimate their motion between consecutive frames, i.e. the so-called frame-to-frame tracking.
4. Template tracking is the most straightforward approach in that case. The object is described by a target template (an image patch, a color histogram) and the motion is defined as a transformation that minimizes the mismatch between the target template and the candidate patch.
5. Templates have limited modeling capabilities, as they represent only a single appearance of the object.
6. To model more appearance variations, generative models have been proposed.
7. Generative models are either built offline or during run-time.
8. Generative trackers model only the appearance of the object and as such often fail in cluttered backgrounds.
9. In order to alleviate this problem, recent trackers also model the environment where the object moves.
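Template tracking as described above can be sketched with a brute-force sum-of-squared-differences search. A real implementation would use cv2.matchTemplate, but a plain NumPy version on a tiny made-up image shows the idea:

```python
import numpy as np

def match_template(image, template):
    """Sketch of template tracking: slide the template over the image
    and return the top-left (x, y) with the smallest sum of squared
    differences, i.e. the transformation minimizing the mismatch."""
    ih, iw = image.shape
    th, tw = template.shape
    best, best_pos = None, (0, 0)
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            patch = image[y:y + th, x:x + tw]
            ssd = np.sum((patch - template) ** 2)
            if best is None or ssd < best:
                best, best_pos = ssd, (x, y)
    return best_pos

img = np.zeros((8, 8))
img[3:5, 4:6] = 1.0          # the "object": a bright 2x2 blob
tmpl = np.ones((2, 2))       # the target template
print(match_template(img, tmpl))  # (4, 3)
```

This also illustrates the limitation raised in point 5: the single template cannot match the object once its appearance changes, which is why generative models and, ultimately, frameworks like TLD were proposed.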

