A Gaussian Process Guided Particle Filter for Tracking 3D Human Pose in Video

Suman Sedai, Mohammed Bennamoun, Member, IEEE, and Du Q. Huynh, Member, IEEE
Abstract—In this paper, we propose a hybrid method that combines Gaussian process learning, a particle filter, and annealing to track the 3D pose of a human subject in video sequences. Our approach, which we refer to as the annealed Gaussian process guided particle filter, comprises two steps. In the training step, we use a supervised learning method to train a Gaussian process regressor that takes the silhouette descriptor as an input and produces multiple output poses modeled by a mixture of Gaussian distributions. In the tracking step, the output pose distributions from the Gaussian process regression are combined with the annealed particle filter to track the 3D pose in each frame of the video sequence. Our experiments show that the proposed method does not require initialization and does not lose track of the pose. We compare our approach with a standard annealed particle filter using the HumanEva-I dataset and with other state of the art approaches using the HumanEva-II dataset. The evaluation results show that our approach can successfully track the 3D human pose over long video sequences and gives more accurate pose tracking results than the annealed particle filter.
Index Terms—3D human pose tracking, Gaussian process regression, particle filter, hybrid method.
I. INTRODUCTION AND MOTIVATION
Image and video-based human pose estimation and tracking is a popular research area due to its large number of applications, including surveillance, security and human computer interaction. For example, in video-based smart surveillance systems, 3D poses can be used to infer the action of a subject in a scene and detect abnormal behaviors. 3D pose estimation can also provide an advanced human computer interface for gaming and virtual reality applications. Furthermore, the 3D poses of a person computed over a number of video frames can be useful for biometric applications to recognize a person. These applications require simple video cameras or still images as input and, as a result, provide a low cost solution in contrast to marker-based systems.
Manuscript received January 5, 2012; revised September 15, 2012 and March 13, 2013; accepted June 12, 2013. Date of publication July 4, 2013; date of current version September 12, 2013. This work was supported in part by the ARC Discovery Project under Grant DP0771294. The associate editor coordinating the review of this manuscript and approving it for publication was Prof. Nikolaos V. Boulgouris.

S. Sedai was with the University of Western Australia, Crawley, WA 6009, Australia. He is now with IBM Research Australia, Melbourne 3000, Australia.

M. Bennamoun and D. Q. Huynh are with the University of Western Australia, Crawley, WA 6009, Australia.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TIP.2013.2271850
Human pose estimation and tracking systems can be broadly classified into three different approaches: discriminative, generative, and hybrid methods.
In generative methods, the output pose is estimated by searching the solution space for a pose that best explains the observed image features [1], [2]. In this approach, a generative model is constructed which measures how close a hypothesized pose is to the observed image features. The hypothesized pose that is most consistent with the observed image features is chosen as the output pose. The particle filter [2], [3] is a generative tracking method that estimates the pose in each frame using the estimate of the previous frame and motion information. Given an image of a human subject, there are multiple 3D human poses associated with the image. Such pose ambiguities can be resolved using images from multiple camera views [4].
In discriminative methods, a regression function that maps image features to the pose parameters is obtained using supervised learning [5], [6]. Although discriminative methods can quickly estimate the 3D pose, they can produce incorrect predictions for new inputs if the system is trained using small datasets. Moreover, the relationship between image features and the human pose is often multimodal. For example, when the human silhouette is used as an image feature, one silhouette can be associated with more than one pose, resulting in ambiguities. In such cases, multiple discriminative models are needed to build one-to-many relationships between image features and poses [5], [7].
Discriminative methods are powerful for specific tasks such as human pose estimation, as they are only based on the mapping from the input to the desired output. Generative methods, on the other hand, are flexible because they provide room for using partial knowledge of the solution space and exploit the human body model to explore the solution space. Due to their different ways of predicting the final output, the two methods are considered to complement each other. Hybrid generative and discriminative methods have thus been shown to have the potential to improve pose estimation performance and, as a result, have gained more attention recently.
In hybrid methods, such as the ones presented in Refs. [8], [9], a discriminative mapping function is used to generate a pose hypothesis space. The pose is then estimated by searching the hypothesis space using a generative method. The method of Ref. [8] is only based on pose estimation from a single image and is therefore unable to track the pose through a video sequence. Moreover, these methods assume that the pose hypothesis space generated from the discriminative model is always correct and hence fail to handle the case when the discriminative model predicts incorrect poses.
In this paper, we propose a hybrid discriminative and generative method to track the 3D human pose from both single and multiple cameras. For the discriminative model, we use a mixture of Gaussian Process (GP) regression models, which are obtained by training GP models in different regions of the pose space. GP regression has the advantage of giving a probabilistic estimate of the 3D human pose. It provides an effective way of incorporating the more confident discriminative predictions into the tracking process while discarding the uncertain ones. To the best of our knowledge, this is the first hybrid method that takes into account the predictive uncertainty of the discriminative model and combines it with a generative model to improve pose estimation. We treat the probabilistic output from the GP regression as one component of the hypothesis space. In the tracking step, we combine this hypothesis space with the hypothesis space obtained from the motion model, and the search for the optimal pose is performed using an annealed particle filter.
A major contribution of this paper is the introduction of a novel method to combine Gaussian Process regression and an annealed particle filter for 3D human pose tracking. Our method can probabilistically discard uncertain predictions from the regression model and hence only use predictions that are likely to be correct to track the 3D pose. Moreover, our method can resolve ambiguities when motion information is used along with the image cues from multiple views during pose tracking.
The organization of the paper is as follows. Section II presents the related work. Section III presents the details of the training of the discriminative model. Section IV presents the proposed annealed Gaussian process guided particle filter (AGP-PF) for 3D human pose tracking. Section V provides the experimental results and Section VI concludes the paper.
II. BACKGROUND AND RELATED WORK
In discriminative estimation, a direct mapping between the image features and the pose parameters is learned using a training dataset. Examples of discriminative learning include nearest neighbor based (example based) regression [6] and sparse regression [10], [11]. Often the relationship between the image features and the pose space is multimodal. The multimodal relationship is established by learning one-to-many mapping functions that are defined in terms of a mixture of sparse regressors [5], [7] or Gaussian Process regression [12], [13]. This setting produces a multiplicity of pose solutions that can either be ranked using a gating function [14], verified using an observation likelihood [8], or disambiguated using temporal constancy [11]. For effective pose estimation using a discriminative approach, it is important that the image features are compact and distinctive. Approaches such as [15], [16] use a metric learning method to suppress the feature components that are irrelevant to pose estimation. Recent research shows that feature selection and pose estimation can be carried out using regression trees [17] and random forests [18]. Other methods such as [19], [20] use dimensionality reduction to make the feature vector more distinctive for pose estimation. The fusion of multiple image cues based on regression has also been used to improve pose estimation performance [21], [22]. In another approach [23], 2D trajectories of the human limbs in a video are used as features and the mapping between the trajectory features and the 3D poses is modeled using Gaussian Process regression. It has also been shown that pose estimation performance can be improved by taking into account the dependencies between the output dimensions, e.g., using structural SVM [24] and Gaussian Process regression [25]. Dimensionality reduction can also be used to address the correlation between output dimensions; e.g., the training of the mapping function can be performed in a lower dimensional subspace of the features and the pose so as to make use of unlabelled data [26].
In generative inference, the pose that best explains the observed image features is determined from a population of hypothesized poses. A search algorithm is used to search the prior pose space for the pose that exhibits the maximum likelihood value for the observed image feature [1], [27]. Generative methods thus consist of three basic components: a state prior (pose hypotheses), an observation likelihood model (matching function), and a search algorithm.
The search algorithm can be local or global in nature. Most of the local search algorithms employ Newton's optimization method [4], [28]. Global search methods are based on stochastic sampling techniques where the solution space is represented by a set of points sampled according to some hypotheses. Successful global search algorithms include annealing [27], Markov Chain Monte Carlo [29], covariance scaled sampling [1], and Dynamic Programming [30]. In order to refine the pose tracking, the sampling based global search techniques have been combined with some local search techniques [31], [32]. In order to reduce the search space, a pose prior based on a predefined set of activities, such as walking and jogging, has been used [33]. Pose priors based on models that constrain the human body pose to physically plausible configurations have also been used to reduce the search space [34]. In order to compute the likelihood of a pose hypothesis, image features such as silhouettes, edges [1], [27], color [29], [30] and learned features [35] have been used. First, the human body model corresponding to the hypothesized pose is projected onto the image feature space to obtain a model feature (a.k.a. template feature). The likelihood value is computed as a similarity measure between the observed image feature and the template feature. For 3D pose estimation, human body models based on cylinders [33], [36], superquadrics [37] and 3D surfaces [4], [9] are commonly used. For 2D pose estimation, a simple rectangular cardboard model has been used [38], [39].
To reduce the complexity of tracking in higher dimensions, prior models based on a lower-dimensional representation of the pose have been used. The commonly used models include linear models such as Principal Component Analysis [33] and non-linear approximations such as the Mixture of Factor Analysers [40]. In another approach [41], the Restricted Boltzmann Machine is used to model human motion in a discrete latent space. A more commonly used non-linear dimensionality reduction technique for pose tracking is the Gaussian Process Latent Variable Model (GPLVM) [42], [43], which does not always provide a smooth latent space as required for tracking. To overcome this problem, a Gaussian Process Dynamical Model (GPDM) [44], which uses non-linear regression to model the motion dynamics in a latent space within a GPLVM framework, can be employed. Tracking in the latent space often has a lower computational complexity; however, it has also been argued that models based on dimensionality reduction have limited capacity [45].
In hybrid methods, discriminative and generative approaches are combined to exploit their complementary power to predict the pose more accurately. To combine the two, the observation likelihood obtained from a generative model is used to verify the pose hypotheses obtained from the discriminative mapping functions [8], [9]. In other work, e.g., [45], generative and discriminative models are iteratively trained using Expectation Maximization (EM): in each step of the EM, the predictions made by one model are used to train the other model. Recently, the work of Ref. [46] combined the discriminative and generative models by applying distance constraints to the predictions made by the discriminative model and verifying them using image likelihoods. None of these approaches utilizes the prediction uncertainty of the discriminative model to track the 3D human pose. In this paper, we propose a hybrid method that takes into account the prediction uncertainty of the discriminative model and effectively combines it with the generative model.
We use Gaussian Process regression as the discriminative model. The probabilistic output poses from the discriminative model are integrated with a particle filter and a subsequent annealing step for 3D human pose tracking. Consequently, the pose tracking performance is improved, since the search space is reduced to only include the correct hypotheses generated from the discriminative model combined with the hypotheses of the motion model. This is an extension of our previous work [47], where we used a Relevance Vector Machine as a discriminative model combined with a particle filter for 2D pose tracking. GP regression has been used in Ref. [48] to learn the dynamical model and the observation likelihood model for object tracking. However, their method does not involve a mapping from the feature space to the output space. Our method, on the other hand, uses GP regression to learn a discriminative model which gives a conditional distribution of the 3D human pose. In this paper, we use a generic form of the motion model and observation likelihood model for tracking. In another, purely discriminative filtering-based approach [49], the unreliability of the observation is modeled using a probabilistic classifier to regulate the predictions from multiple regressors. Our method, on the other hand, models the prediction uncertainty by the variance of the prediction from the Gaussian Process regressor and hence does not require a separate classifier to model the unreliability. Furthermore, unlike [49], our method incorporates the image likelihoods obtained from the generative model to verify the pose hypotheses and to estimate the target state distribution.

Fig. 1. A block diagram showing the training of the mixture of Gaussian Process regression models.
III. TRAINING STEP
In the training step, we use supervised learning to construct a multimodal mapping between the shape descriptor space and the 3D pose space. Once the mapping model is trained, the multimodal 3D poses can be estimated using the trained model. A block diagram showing the training of our mixture of Gaussian Process regressors is shown in Fig. 1. Given the training images, we first extract the silhouette images using background subtraction [50]. Then, we extract the silhouette descriptors using the Discrete Cosine Transform (DCT). We divide the 3D pose space into K clusters and train a mapping from the silhouette descriptor space to the 3D pose space of each cluster. The final output of the training stage is a mixture of Gaussian Process regression models.
A. Feature and Pose Representation
A review of many shape and appearance descriptors that are applicable to discriminative pose estimation is available in Ref. [51]. In this paper, we use the silhouette as the image feature and the Discrete Cosine Transform (DCT) of the silhouette as the shape descriptor. We use the DCT because it is simple to compute and yet more discriminative than other shape descriptors [51], [52]. It is shown in Ref. [52] that the DCT descriptor outperforms other shape descriptors such as the Histogram of Shape Contexts and Lipschitz embeddings for human pose estimation.
The DCT descriptor, which belongs to the family of orthogonal moments, represents the silhouette image by a sum of two-dimensional cosine functions of different frequencies, characterized by the various coefficients. The DCT has been popularly used for image compression. First, a silhouette window is cropped from the foreground image obtained from background subtraction. As shown in Fig. 2, the cropped image is scaled to the size of H × W and the DCT descriptor for the image window is computed as

$$M_{p,q} = \sum_{x=0}^{W-1} \sum_{y=0}^{H-1} f_p(x)\, g_q(y)\, I(x,y), \qquad (1)$$

for $p = 0, \ldots, W-1$ and $q = 0, \ldots, H-1$, where $f_p(x) = \alpha_p \cos\{p\pi(x+0.5)/W\}$, $g_q(y) = \alpha_q \cos\{q\pi(y+0.5)/H\}$, $\alpha_p = \sqrt{(1+\min(p,1))/W}$, and $\alpha_q = \sqrt{(1+\min(q,1))/H}$. We take W = 64 and H = 128 pixels. This window size is the most commonly used size for human subjects in an upright pose. The DCT descriptor has the nice property that most of the rich information about the silhouette is encoded in just a few of its coefficients. We empirically found that setting the descriptor to a 64-dimensional vector (corresponding to 8 rows and 8 columns of the DCT matrix M) is sufficient to represent each silhouette. In this paper, we assume that the subject is upright in the image. Although the DCT descriptor is not rotation invariant, this does not affect the pose estimation so long as the human subject is in an upright pose in the image. It is possible to handle athletic motions such as handstands and cartwheels by training a separate discriminative model for such activities and using a classifier to select the most appropriate model for an input feature. Since the silhouette is centered and re-scaled to a standard size, the descriptor is invariant to translation and scale.

Fig. 2. Computation of the DCT descriptor from a silhouette image.
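To make the descriptor computation concrete, the following minimal Python sketch (function and parameter names are our own, not from the paper) rescales a binary silhouette window to 128 × 64 and keeps the low-frequency 8 × 8 block of an orthonormal 2D DCT-II, which matches the normalization of Eq. (1):

```python
import numpy as np
from scipy.fft import dctn  # orthonormal 2D DCT-II matches Eq. (1)

def dct_descriptor(silhouette, H=128, W=64, n_coeffs=8):
    """Compute a 64-D DCT shape descriptor from a binary silhouette.

    `silhouette` is a 2D 0/1 array containing the cropped foreground
    window; it is rescaled to H x W by nearest-neighbour sampling.
    """
    # Nearest-neighbour rescale to the standard window size.
    rows = (np.arange(H) * silhouette.shape[0] / H).astype(int)
    cols = (np.arange(W) * silhouette.shape[1] / W).astype(int)
    window = silhouette[np.ix_(rows, cols)].astype(float)

    # 2D DCT-II with orthonormal scaling, as in Eq. (1).
    M = dctn(window, norm='ortho')

    # Keep the low-frequency 8 x 8 block -> 64-dimensional vector.
    return M[:n_coeffs, :n_coeffs].ravel()
```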
We represent each 3D pose by the relative orientations of the body parts in terms of Euler angles. We take the torso as the root segment; the orientation of the torso segment is measured with respect to a global coordinate system. The orientations of the upper arms are measured relative to the orientation of the torso, and the orientations of the lower arms are measured relative to the upper arms. Relative orientations between the upper legs and the torso and between the lower and upper legs are defined in a similar manner.

B. Mixture of Gaussian Process (MGP) Regressors
We use a supervised learning technique to estimate the 3D pose directly from the silhouette descriptor. We learn a piecewise mapping with Gaussian Process (GP) regressors [53] from the shape descriptor space $x \in \mathbb{R}^m$ to the 3D pose space $y \in \mathbb{R}^d$ using the training data samples $T = \{(x^{(i)}, y^{(i)})\}$, for $i = 1, \ldots, N$, where N is the number of training samples. With such a mixture of Gaussian Process (MGP) regression model [54], a GP regressor is trained for each region of the data space and a separate classifier is used to select the GP model that is appropriate for an input feature. First, the 3D pose space is partitioned into K clusters using the hierarchical k-means algorithm. The training set is then divided into K subsets, $T_1, \ldots, T_K$, such that $(x^{(i)}, y^{(i)}) \in T_k$ if $y^{(i)}$ belongs to the kth cluster. We assume that the components of the output vector y are independent of each other, so we train a separate GP regression model for each component of $y = [y_1, \ldots, y_d]^T$. Without loss of generality, we drop the subscript q in $y_q$ (which represents the qth component of y) and present only a one-dimensional Gaussian Process regression. In each cluster, the relationship between $x^{(i)}$ and each component $y^{(i)}$ of the training pose instance (the superscript (i) represents the ith training instance) is modeled by

$$y^{(i)} = f_k(x^{(i)}) + \epsilon_k^{(i)}, \qquad (2)$$

where $\epsilon_k^{(i)} \sim \mathcal{N}(0, \beta_k^{-1})$ and $\beta_k$ is the hyper-parameter representing the precision of the noise. From the definition of a Gaussian Process [53], the joint distribution of the output variables is given by a Gaussian:

$$p(\mathbf{y}_k \mid X_k) = \mathcal{N}(0, C_k), \qquad (3)$$

where $\mathbf{y}_k = [y^{(1)}, \ldots, y^{(N_k)}]^T$ and $X_k = [x^{(1)}, \ldots, x^{(N_k)}]^T$ for all $y^{(i)}$ in the kth cluster, and $C_k$ is a covariance matrix whose entries are given by $C_k(i,j) = \kappa(x^{(i)}, x^{(j)}) + \beta_k^{-1}\delta_{ij}$. The covariance function $\kappa(x^{(i)}, x^{(j)})$ can be expressed as

$$\kappa(x^{(i)}, x^{(j)}) = \theta_{k,1} \exp\left(-\frac{\theta_{k,2}}{2}\,\|x^{(i)} - x^{(j)}\|^2\right) + \theta_{k,3}, \qquad (4)$$

where the parameters $\Theta_k = \{\theta_{k,1}, \theta_{k,2}, \theta_{k,3}, \beta_k\}$ are referred to as the hyper-parameters of the GP and $\delta_{ij}$ is the Kronecker delta function. This covariance function combines the Radial Basis Function (RBF) and a bias term. The learning of the hyper-parameters is based on the evaluation of the likelihood function $p(\mathbf{y}_k \mid \Theta_k)$ and the maximization of the log likelihood using a gradient-based optimization technique such as conjugate gradient. The log likelihood function for a Gaussian process can be evaluated as

$$\ln p(\mathbf{y}_k \mid \Theta_k) = -\frac{1}{2}\ln|C_k| - \frac{1}{2}\mathbf{y}_k^T C_k^{-1}\mathbf{y}_k - \frac{N_k}{2}\ln(2\pi). \qquad (5)$$

Once the hyper-parameters are trained, the next step is to predict the output pose component $y_{k*}$ for an unseen test feature vector $x_*$. This requires the evaluation of the predictive distribution $p(y_{k*} \mid \mathbf{y}_k, X_k)$. Let us define $\mathbf{y}_{k*} = [y^{(1)}, \ldots, y^{(N_k)}, y_{k*}]^T$, whose joint distribution is given by

$$p(\mathbf{y}_{k*}) = \mathcal{N}(0, C_{k*}), \qquad (6)$$

where $C_{k*} \in \mathbb{R}^{(N_k+1)\times(N_k+1)}$ is a covariance matrix. That is,

$$C_{k*} = \begin{bmatrix} C_k & \mathbf{c}_{k*} \\ \mathbf{c}_{k*}^T & c_k \end{bmatrix}, \qquad (7)$$

where the vector $\mathbf{c}_{k*}$ has elements $\mathbf{c}_{k*}(i) = \kappa(x^{(i)}, x_*)$, for $i = 1, \ldots, N_k$, and the scalar $c_k = \kappa(x_*, x_*) + \beta_k^{-1} = \theta_{k,1} + \theta_{k,3} + \beta_k^{-1}$ from Eq. (4).
The conditional distribution $p(y_{k*} \mid \mathbf{y}_k, X_k)$ is a Gaussian distribution with mean and variance given by

$$\bar{y}_{k*} = \mathbf{c}_{k*}^T C_k^{-1} \mathbf{y}_k \qquad (8)$$

$$\sigma_k^2(x_*) = c_k - \mathbf{c}_{k*}^T C_k^{-1} \mathbf{c}_{k*}. \qquad (9)$$
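For illustration, a minimal sketch of the per-component GP computations, under our own function names and not the authors' code: the covariance of Eq. (4), the log marginal likelihood of Eq. (5), and the predictive mean and variance of Eqs. (8) and (9):

```python
import numpy as np

def kernel(X1, X2, theta1, theta2, theta3):
    """RBF-plus-bias covariance of Eq. (4)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return theta1 * np.exp(-0.5 * theta2 * d2) + theta3

def log_likelihood(X, y, theta1, theta2, theta3, beta):
    """Log marginal likelihood of Eq. (5); maximized over the hyper-parameters."""
    N = len(y)
    C = kernel(X, X, theta1, theta2, theta3) + np.eye(N) / beta
    sign, logdet = np.linalg.slogdet(C)
    return -0.5 * logdet - 0.5 * y @ np.linalg.solve(C, y) - 0.5 * N * np.log(2 * np.pi)

def gp_predict(X, y, x_star, theta1, theta2, theta3, beta):
    """Predictive mean (Eq. 8) and variance (Eq. 9) at a test feature x_star."""
    C = kernel(X, X, theta1, theta2, theta3) + np.eye(len(y)) / beta
    c_star = kernel(X, x_star[None, :], theta1, theta2, theta3).ravel()
    c = theta1 + theta3 + 1.0 / beta               # kappa(x*, x*) + 1/beta
    mean = c_star @ np.linalg.solve(C, y)          # Eq. (8)
    var = c - c_star @ np.linalg.solve(C, c_star)  # Eq. (9)
    return mean, var
```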
In this manner, we obtain the prediction for each component of the pose vector. Let $\bar{y}_{q,k*}$ be the prediction for the qth component of the pose vector and $\sigma_{q,k}^2(x_*)$ be the corresponding variance. The full pose vector is $\bar{\mathbf{y}}_{k*} = [\bar{y}_{1,k*}, \ldots, \bar{y}_{d,k*}]^T$ and the corresponding covariance matrix becomes $\Sigma_k(x_*) = \mathrm{diag}(\sigma_{1,k*}^2, \ldots, \sigma_{d,k*}^2)$. The multimodal relationship between the silhouette descriptor x and the 3D pose vector y is thus represented by a mixture of K regressors:

$$p(\mathbf{y} \mid x_*) = \sum_{k=1}^{K} g_k(x_*)\, \mathcal{N}(\bar{\mathbf{y}}_{k*}, \Sigma_k(x_*)), \qquad (10)$$

where $g_k(x)$ is a K-class classifier which gives the probability that the kth GP regressor is selected to predict the given feature instance $x_*$. We model the multi-class classifier $g_k(x)$ as a multinomial logistic regressor (i.e., the softmax function):

$$g_k(x) = \frac{\exp(-\mathbf{v}_k^T x)}{\sum_{j=1}^{K} \exp(-\mathbf{v}_j^T x)}, \qquad (11)$$

where $\mathbf{v}_k$ is an m-dimensional parameter vector. The parameter vectors $\mathbf{v}_1, \ldots, \mathbf{v}_K$ are estimated from the training data $\{(x^{(i)}, \mathbf{l}^{(i)})\}_{i=1}^{N}$, where $\mathbf{l}^{(i)} = [l_1^{(i)}, \ldots, l_K^{(i)}]$ and $l_k^{(i)}$ denotes the probability that feature vector $x^{(i)}$ belongs to the kth cluster. We set $l_k^{(i)} = 1$ if $y^{(i)}$ belongs to the kth cluster, otherwise we set it to 0. The maximum likelihood estimation of the parameter vectors is then performed using the iteratively reweighted least squares method. We use a fast method based on bound optimization described in [55] to train these parameter vectors.
Instead of the multinomial logistic regressor, an alternative is to model the multi-class classifier $g_k(x)$ as a function of the variance of the kth GP model, i.e., by setting $g_k(x) \propto 1/\mathrm{trace}(\Sigma_k(x))$. As the average variance of the prediction is lower when the test feature is closer to the training samples, a higher weight is given to the cluster which is closer to the test feature vector. In our empirical evaluation, shown in Table III, we found that the pose estimation performance of the MGP regressor is improved when the multinomial logistic regressor is used rather than the function of the variance.
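A sketch of the full MGP prediction of Eq. (10) with the softmax gate of Eq. (11) might look as follows; the `predict` interface on the per-component GPs is an assumption of this illustration, standing in for Eqs. (8) and (9):

```python
import numpy as np

def gating(x, V):
    """Softmax gate of Eq. (11); V is a K x m matrix whose rows are the v_k."""
    scores = np.exp(-V @ x)
    return scores / scores.sum()

def mgp_predict(x, V, cluster_models):
    """Evaluate the mixture of Eq. (10) at a test descriptor x.

    `cluster_models[k]` is a list of d per-component GP predictors for
    cluster k, each exposing predict(x) -> (mean, variance).
    """
    g = gating(x, V)
    means, covs = [], []
    for gps in cluster_models:                 # one entry per cluster k
        stats = [gp.predict(x) for gp in gps]  # d per-component predictions
        means.append(np.array([m for m, _ in stats]))
        covs.append(np.diag([v for _, v in stats]))
    return g, means, covs
```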
1) Pose Space Clustering: The motivation behind clustering the pose space and learning a discriminative model in each cluster is to model the depth ambiguities introduced by the silhouettes. For that purpose, we cluster the pose space into six partitions using hierarchical k-means. At the first level, the pose space is partitioned into four clusters representing poses which face forward, backward, left and right with respect to the camera by a careful initialization. At the second level, each cluster that represents a lateral pose (left or right) is further partitioned into two clusters. Figure 3 shows the representative poses in each cluster (C1–C6) along with the silhouettes that give rise to ambiguous poses. Clusters C1 and C2 model the forward-backward ambiguity, as the silhouette S1 could be generated from poses in both C1 and C2. Similarly, clusters C3 and C4 model the ambiguity associated with the silhouette S2; clusters C5 and C6 model the ambiguity associated with the silhouette S3. Other approaches address the ambiguities by sampling from the multimodal posterior obtained from a mixture of regressor models [5], [7]. In another interesting approach [56], it is shown that ambiguous poses can be distinguished by sampling from the posterior in a latent space that is shared by both the observation and the pose spaces. In their approach, the posterior in the latent space is obtained using the GPLVM model.

Fig. 3. Representative pose in each cluster obtained using hierarchical k-means. Each pose is rendered as a 3D cylindrical model with red denoting left limbs and blue denoting right limbs. Silhouettes that give rise to ambiguous poses are also shown (Figure best viewed in color).
This approach of hard clustering could result in a poor prediction performance at the boundary of the clusters. However, since we use Gaussian Process regression as the discriminative model, such poor predictions can often be detected, as they tend to produce larger prediction variances. In Section IV-C, we discuss the adoption of a probabilistic method to discard poor predictions made by the MGP regression model during the tracking of the 3D pose. The method of Ref. [12] provides an interesting solution to the boundary problem in that an MGP regressor is trained on the subset of the data that is closest to the test feature. However, their method requires computing the GP hyper-parameters for each test image.
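A two-level clustering of this kind can be sketched with an off-the-shelf k-means routine; the sketch below is illustrative only and omits the careful orientation-based initialization of the first level described above:

```python
import numpy as np
from sklearn.cluster import KMeans

def hierarchical_pose_clusters(Y, lateral=(2, 3)):
    """Two-level hierarchical k-means over pose vectors Y (N x d).

    Level 1 splits the pose space into four clusters (ideally seeded so
    they face forward/backward/left/right); level 2 splits each lateral
    cluster (indices in `lateral`) into two, giving six clusters in total.
    """
    top = KMeans(n_clusters=4, n_init=10, random_state=0).fit(Y)
    labels = top.labels_.copy()
    next_id = 4
    for c in lateral:
        idx = np.where(top.labels_ == c)[0]
        sub = KMeans(n_clusters=2, n_init=10, random_state=0).fit(Y[idx])
        labels[idx[sub.labels_ == 1]] = next_id  # second sub-cluster gets a new id
        next_id += 1
    return labels  # cluster ids in {0, ..., 5}
```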
Fig. 4. A block diagram showing the testing of our proposed 3D pose tracking system. The 3D human pose for each video frame is predicted from the mixture of GP regressions; the tracking incorporates the predicted pose, prediction uncertainty, edge and silhouette observations using the AGP-PF method.
IV. POSE TRACKING
In the tracking step, our goal is to determine the 3D pose of a person in the video frames. The block diagram of our proposed tracking method is shown in Fig. 4. For each image frame in a video, the human pose is predicted using the Mixture of Gaussian Process (MGP) regressors (Section III). The output of the MGP regressors is a mixture of Gaussian distributions. This output distribution is taken as one component of the hypothesis space; the other component is the output pose distribution of the previous frame. The output pose is then computed by searching the combined pose hypothesis space using our proposed AGP-PF tracking method.
Our method needs a human body model to compute the likelihood of each pose hypothesis. In this paper, we use a 3D cylindrical model of the human body. Each body part is represented by a tapered cylinder, as shown in Fig. 5. The lengths of the body parts and the diameters of the cylinders are assumed to be fixed for a given person and are initialized at the beginning of the tracking.
We use human kinematic constraints to determine the degrees of freedom (DOF) of the pose. The torso has six degrees of freedom after incorporating the global translation and rotation. The head, upper arms and upper legs have three degrees of freedom each. The forearms have one degree of freedom each (they are only allowed to rotate about their Y axis) and the lower legs have two degrees of freedom each (they are allowed to rotate about their X and Y axes). Hence the human body model shown in Fig. 5 has 27 degrees of freedom. We enforce the joint angle limits by restricting the variation of the angles to a kinematically possible range. The kinematically possible angles are computed from the training data.
In this paper, we use our proposed Gaussian Process guided particle filter for pose tracking. Below, we first give a review of the standard particle filter (for completeness) in Section IV-B, followed by our proposed Gaussian Process guided particle filter for pose tracking in Section IV-C.
Fig. 5. A 3D human body model in a neutral pose. Each body part of the model is approximated by a tapered cylinder. The X-axis is denoted by a large circled dot and is perpendicular to both the Y- and Z-axes.
A. Likelihood Distribution
Given an image observation denoted by $r_t$ at time t, the likelihood density $p(r_t \mid y_t)$ measures how well a hypothesized pose vector $y_t$ explains the image observation $r_t$. In our method, the likelihood value is computed by matching the 3D human body model corresponding to the hypothesized pose with the observed image features. We use the silhouette and edge features of the image to compute the likelihood of each pose state vector. We first project the 3D cylindrical model corresponding to the hypothesized pose onto the image plane using the camera calibration matrix to obtain the hypothesized edge features and the hypothesized region features. The likelihood value is then computed by matching the hypothesized features with the observed silhouette and edge features, based on two cost measures: the silhouette cost and the edge cost, as detailed below.
1) Silhouette Cost: The silhouette cost measures how well the region projected by the hypothesized model fits the observed silhouette. Given a 3D pose hypothesis, we first generate a binary image H of the corresponding 3D human body model such that H(x, y) = 1 if the pixel corresponds to the hypothesized foreground and H(x, y) = 0 otherwise. An example of a hypothesized foreground is shown in Fig. 6(d). Let Z be the observed silhouette image as shown in Fig. 6(b). The part of the silhouette Z that is not explained by the model region H is given by $R_1 = Z \cap \bar{H}$, where $\cap$ denotes the pixel-wise "and" operator and $\bar{H}$ denotes the inverted image of H. Similarly, the part of the model region that is not explained by the silhouette is given by $R_2 = H \cap \bar{Z}$. Since our objective is to minimize these unexplained silhouette and model regions, the cost can be expressed as

$$C_{sil} = 0.5\left(\frac{\mathrm{Area}(R_1)}{\mathrm{Area}(Z)} + \frac{\mathrm{Area}(R_2)}{\mathrm{Area}(H)}\right), \qquad (12)$$
Fig. 6. (a) Input test image. (b) Observed silhouette image Z. (c) 3D human body model for a hypothesized pose. (d) Projected silhouette image H obtained from the model. (e) H superimposed on Z. (f) An example of the visible model edge points projected into the observed edge image.
where Area(I) gives the number of non-zero pixels in the binary image I. When the hypothesized model fits the observed silhouette exactly, then Area(R1) = Area(R2) = 0 and $C_{sil}$ gives the lowest cost of 0. When the two regions have zero overlap, then no part of the silhouette region is explained by the model (i.e., $Z \cap \bar{H} = Z$) and no part of the model region is explained by the silhouette (i.e., $H \cap \bar{Z} = H$). In this case, $C_{sil}$ gives the highest cost value of 1. This measure of the silhouette cost is similar to that of [57], [58].
2) Edge Cost: The edge cost measures how well the boundary line corresponding to the hypothesized model fits the observed edge image. Given an input image, we detect edges by thresholding the gradient image to obtain a binary edge map [2]. We segment the foreground edges corresponding to the human subject by masking the binary edge image with the silhouette image. We then construct a Gaussian distance map $E_1$ of the segmented edge image to determine the edge probability of a given pixel. The Gaussian distance map, which gives the proximity of a pixel to an edge, can be obtained by convolving the binary edge image with a Gaussian kernel and rescaling the pixel values between 0 and 1 [2]. In the next step, we generate a set of hypothesized edge points $E_2$ by projecting the visible boundary of the 3D cylindrical model corresponding to a pose hypothesis onto the edge image and sparsely sampling points along the boundary. The points that are hidden due to self-occlusion are discarded using the depth information of the body model. The edge cost is then obtained by computing the mean square error (MSE) of the edge probability values:

$$C_{edge} = \frac{1}{|E_2|} \sum_{p \in E_2} (1 - E_1(p))^2. \qquad (13)$$

Similar to $C_{sil}$, the value of $C_{edge}$ falls between 0 and 1. Assuming equal influence of edge and silhouette features on tracking, the final likelihood for a given hypothesized pose is approximated by

$$p(r_t \mid y_t) \approx \exp(-(C_{sil} + C_{edge})). \qquad (14)$$

For the case when images from more than one camera view are available, the silhouette and edge costs are computed using images from each camera and the costs are averaged to obtain the final cost of a pose hypothesis in Eq. (14).
B. Particle Filter
The particle filter is a Monte Carlo approximation to sequential Bayesian estimation, which propagates the posterior probability of a first-order Markov process from time t−1 to t through the following equation:

$$p(y_t \mid R_t) = c\, p(r_t \mid y_t) \int_{y_{t-1}} p(y_t \mid y_{t-1})\, p(y_{t-1} \mid R_{t-1})\, dy_{t-1}, \qquad (15)$$

where c is a constant, $y_t$ is the 3D pose state at time t; $r_t$ is the image observation; $R_t = [r_1, \ldots, r_t]$ comprises all observations observed sequentially up to time t; $p(y_t \mid y_{t-1})$ is a distribution that describes the motion model; and $p(r_t \mid y_t)$ is the observation likelihood distribution. The multi-dimensional integral of Eq. (15) can only be evaluated analytically for the simple case where the posterior distribution of the state variable is Gaussian. When the state variable corresponds to the human pose, the posterior distribution is non-Gaussian and methods like the Kalman filter generally fail [59]. Particle filters are therefore often used to approximate Eq. (15) using a set of weighted samples $S_t = \{y_t^{(i)}, \pi_t^{(i)}\}_{i=1}^{n}$, where each $y_t^{(i)}$ is a particle, $\pi_t^{(i)}$ is the corresponding particle weight, which is normalized to ensure $\sum_i \pi_t^{(i)} = 1$, and n denotes the number of particles. The particle filter does not make any explicit assumption about the form of the posterior and hence is applicable to systems where the posterior distribution of the state variable is non-Gaussian. In order to estimate the pose using the particle filter, one must design two models: a dynamical model (a.k.a. motion model), namely $p(y_t^{(i)} \mid y_{t-1}^{(i)})$, which describes the movement of the human subject from one frame to another in 3D space; and the observation likelihood model $p(r_t \mid y_t)$, which gives the probability that observation $r_t$ can be generated by a pose sample $y_t$, as described in Section IV-A. At each time step t, given the particle set $S_{t-1}$, basic sequential importance resampling updates the particles in three steps [60]. First, n particles are sampled from the discrete distribution denoted by $S_{t-1}$ with replacement. In the second step, each sampled particle is modified by the motion model. In the third step, the normalized importance weight is computed from the observation likelihood. The new particle-weight set at time t is obtained as $S_t$. The particle-weight set $S_t$ represents the posterior distribution of the state, and the output can be computed by taking the expected value of the posterior distribution represented by the particle-weight set.
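The three-step update can be written compactly. The following sketch shows one generic sequential importance resampling step; `motion_model` and `likelihood` are placeholders for the models of Section IV-A and the dynamics described above:

```python
import numpy as np

def sir_step(particles, weights, motion_model, likelihood, rng):
    """One sequential importance resampling step of the particle filter.

    particles: (n, d) array of pose states; weights: (n,) normalized.
    """
    n = len(particles)
    # 1. Resample n particles from the discrete posterior with replacement.
    idx = rng.choice(n, size=n, p=weights)
    resampled = particles[idx]
    # 2. Propagate each particle through the motion model.
    predicted = np.array([motion_model(y, rng) for y in resampled])
    # 3. Re-weight by the observation likelihood and normalize.
    w = np.array([likelihood(y) for y in predicted])
    w /= w.sum()
    return predicted, w
```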
A simple particle filter does not work accurately in higher dimensions because a large number of particles is required to populate such a higher dimensional state space. Using a small number of particles might lead to an entrapment of the particles around local maxima. The occurrence of local maxima in the state space is common in human pose tracking because there are many ways the model can partially fit the observed image. Variants of the particle filter, such as the annealed particle filter [27], which iteratively pushes the particles towards the high probability regions of the state space, have been developed. Moreover, the computed importance weights might not always be correct, mainly because of noisy or ambiguous observations and an incorrect generative model. This leads to frequent mistracking. Our Annealed Gaussian Process guided particle filter, described in the following subsection, aims at tackling this problem.

Fig. 7. Graphical model for the AGP-PF method that includes the conditional $p(y_t \mid y_{t-1}, x_t)$ and the observation likelihood $p(r_t \mid y_t)$. We assume that $p(y_t \mid y_{t-1}, x_t)$ can be factored into the discriminative model $p(y_t \mid x_t)$ and the motion model $p(y_t \mid y_{t-1})$ according to Eq. (18).
C. Annealed Gaussian Process Guided Particle Filter

In the Annealed Gaussian Process Guided Particle Filter (AGP-PF), not only the motion information is used, but the discriminative distribution obtained from the supervised learning in Section III is also used to track the pose from one frame to another. The graphical model for the AGP-PF is shown in Fig. 7. Let $y_t$ denote a hidden state that represents the 3D pose at time t and $x_t$ be the image observation at time t that is specifically used to generate the discriminative distribution $p(y_t \mid x_t)$. Let $r_t$ be the image observation at time t that is used to compute the likelihood distribution $p(r_t \mid y_t)$. Let $X_t = [x_1, \ldots, x_t]$ and $R_t = [r_1, \ldots, r_t]$ be all observations which have been sequentially observed until time t. Then the posterior density of the state after new observations $x_t$ and $r_t$ is given by a recursive Bayesian equation

$$p(y_t \mid R_t, X_t) = c\, p(r_t \mid y_t)\, p(y_t \mid R_{t-1}, X_t), \qquad (16)$$
where $c = 1/p(r_t \mid R_{t-1}, X_t)$ is a constant term relative to $y_t$, and $p(r_t \mid y_t) = p(r_t \mid y_t, X_t)$ follows from the conditional independence according to the graphical model in Fig. 7. Since the integral of the target distribution $p(y_t \mid R_t, X_t)$ over the entire space of $y_t$ should be one, the target distribution can be calculated by first computing Eq. (16) without the constant c and then normalizing. Therefore, the constant c does not influence the estimation of the target distribution. The conditional independence assumption holds when the features r and x have different properties and do not depend on each other. In our case, x is taken as the Discrete Cosine Transform of the silhouette and hence is a much coarser description of the shape. On the other hand, r represents the edges of the body segments, which corresponds to a richer appearance representation of the body parts. As the value of the DCT descriptor of a silhouette gives no knowledge about the value of the edge features, these two features can be considered independent. The prior distribution at time t can be written as
$$p(y_t \mid R_{t-1}, X_t) = \int p(y_t \mid y_{t-1}, x_t)\, p(y_{t-1} \mid R_{t-1}, X_{t-1})\, dy_{t-1}. \qquad (17)$$
We assume that the conditional distribution $p(y_t \mid y_{t-1}, x_t)$ can be expressed as a mixture of simpler conditionals [61], i.e.,

$$p(y_t \mid y_{t-1}, x_t) = (1 - \alpha(x_t))\, p(y_t \mid y_{t-1}) + \alpha(x_t)\, p(y_t \mid x_t), \qquad (18)$$

where $p(y_t \mid x_t)$ is a discriminative distribution expressed in terms of a mixture of Gaussians obtained from the trained Gaussian Process models (see Eq. (10)); $\alpha(x_t) \in [0, 1]$ is a mixing coefficient that denotes the contribution of the discriminative distribution towards the prior at t. We discard any contribution of the discriminative distribution $p(y_t \mid x_t)$ when it has a large variance $\sigma^2(x_t)$. This is achieved by relating the mixing coefficient $\alpha(x_t)$ to the variance $\sigma^2(x_t)$ using the following equation:

$$\alpha(x_t) = \exp\left(-\sigma^2(x_t)/\lambda\right), \qquad (19)$$

where $\sigma^2(x_t) = \mathrm{trace}[\Sigma_k(x_t)]/d$ is the average variance of the most likely GP model selected from Eq. (10). The most likely GP model is the one with the highest gating probability $g_k(x_t)$. Figure 9 shows the variation of $\alpha(x_t)$ with respect to $\sigma^2(x_t)$ given in Eq. (19) for different values of $\lambda$. The value of the control parameter $\lambda$ is determined empirically, as discussed in Section V-C.1. Eq. (19) implies that a prediction from a GP model with a large variance (i.e., low certainty) is less likely to be correct and thus fewer samples should be drawn from it. From Eqs. (17) and (18), the prior distribution can be expressed as
$$p(y_t \mid R_{t-1}, X_t) = (1 - \alpha(x_t))\, p(y_t \mid R_{t-1}, X_{t-1}) + \alpha(x_t)\, p(y_t \mid x_t), \qquad (20)$$

where

$$p(y_t \mid R_{t-1}, X_{t-1}) = \int p(y_t \mid y_{t-1})\, p(y_{t-1} \mid R_{t-1}, X_{t-1})\, dy_{t-1}, \qquad (21)$$

which is the first term of the prior distribution, is obtained by applying the motion model to the posterior distribution at t−1. We approximate the posterior at time t−1 by the particle-weight set $S_{t-1} = \{y_{t-1}^{(i)}, \pi_{t-1}^{(i)}\}_{i=1}^{n}$ and assume a statistically stationary motion model, i.e., $p(y_t \mid y_{t-1}) \sim \mathcal{N}(y_{t-1}, \Sigma)$, where $\Sigma$ is a diagonal covariance matrix. A method to determine the value of $\Sigma$ is discussed in Section V-C1. We can therefore write the first term of the prior distribution as $p(y_t \mid R_{t-1}, X_{t-1}) = \sum_i \pi_{t-1}^{(i)}\, \mathcal{N}(y_{t-1}^{(i)}, \Sigma)$.
The second term of the prior is the discriminative density $p(y_t \mid x_t)$ obtained from supervised learning and given by Eq. (10). The final prior of Eq. (20) can then be expressed as

$$p(y_t \mid R_{t-1}, X_t) = (1 - \alpha(x_t)) \sum_{i=1}^{n} \pi_{t-1}^{(i)}\, \mathcal{N}(y_{t-1}^{(i)}, \Sigma) + \alpha(x_t)\, p(y_t \mid x_t), \qquad (22)$$

where the discriminative distribution $p(y_t \mid x_t)$ is given by Eq. (10).
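In code, drawing a particle from the mixed prior of Eq. (22) amounts to a per-particle choice between the motion prior and the discriminative mixture. A minimal sketch, assuming a `sample_discriminative` routine that draws from the MGP mixture of Eq. (10):

```python
import numpy as np

def mixing_coefficient(sigma2, lam):
    """Mixing coefficient of Eq. (19)."""
    return np.exp(-sigma2 / lam)

def sample_prior(particles, weights, Sigma_diag, alpha, sample_discriminative, rng):
    """Draw one particle from the mixed prior of Eq. (22)."""
    if rng.random() < alpha:
        # With probability alpha, sample the pose from the
        # discriminative MGP distribution of Eq. (10).
        return sample_discriminative(rng)
    # Otherwise pick a previous particle by its weight and diffuse it
    # with the stationary motion model N(y_{t-1}, Sigma).
    i = rng.choice(len(particles), p=weights)
    return particles[i] + rng.normal(0.0, np.sqrt(Sigma_diag))
```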
We use the silhouette and edge based likelihood to model the likelihood distribution $p(r_t \mid y_t)$, as described in Section IV-A. The importance weight for each particle can be computed as a normalized likelihood value, i.e.,

$$\pi_t^{(i)} \propto \left( p(r_t \mid y_t^{(i)}) \right)^{A_l}, \qquad (23)$$

where $p(r_t \mid y_t^{(i)})$ is given by Eq. (14), $A_l$ is the inverse annealing temperature for the lth annealing layer, and $\pi_t^{(i)}$ is normalized so that $\sum_{i=1}^{n} \pi_t^{(i)} = 1$. Simulated annealing works on the principle that a region of high probability lies in the vicinity of the particles which have higher weights [2]. At the first annealing layer, the particles are allowed to float in a high energy state. This means that all the particles have similar probabilities associated with them, resulting in a smooth, non-peaky likelihood function. The system is then gradually allowed to cool down at the successive annealing layers. By doing so, peaks in the likelihood function are introduced slowly. Consequently, more particles are concentrated around the high probability region at the end of the annealing layers. We automatically find the value of the annealing temperature $A_l$ so that the particle survival rate at each layer is around 50% (following the method of [2]):

$$\zeta(A_l) = \frac{1}{n} \left( \sum_{i=1}^{n} \left( \pi_t^{(i)} \right)^2 \right)^{-1}, \qquad (24)$$

where $A_l$ is estimated by minimizing $|\zeta(A_l) - 0.5|$. This process of decreasing the particle survival rate has the same effect as the cooling of a mechanical system.
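The temperature selection can be posed as a one-dimensional search, as in the sketch below (the search bounds on $A_l$ are our own assumption):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def anneal_weights(raw_likelihoods, A):
    """Annealed, normalized weights of Eq. (23)."""
    w = raw_likelihoods ** A
    return w / w.sum()

def survival_rate(weights):
    """Particle survival rate of Eq. (24): zeta = (n * sum(w^2))^-1."""
    return 1.0 / (len(weights) * np.sum(weights ** 2))

def find_temperature(raw_likelihoods):
    """Choose A_l so that the survival rate is close to 0.5."""
    objective = lambda A: abs(survival_rate(anneal_weights(raw_likelihoods, A)) - 0.5)
    result = minimize_scalar(objective, bounds=(1e-3, 50.0), method='bounded')
    return result.x
```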
The steps of our proposed method are given in Table I. Let $S_{t,l}$ be the particle-weight set output at the end of the lth annealing layer of frame t. For t = 0, the initial set of particles $S_0$ is obtained by sampling from the discriminative distribution $p(y_0 \mid x_0)$, and equal importance weights are assigned to the particles. For t > 0, the input to the first annealing layer is the posterior distribution of the previous frame, i.e., $S_{t,0} = S_{t-1}$. Then, at each annealing layer, the prior distribution is constructed from the discriminative distribution $p(y_t \mid x_t)$, the mixing coefficient $\alpha(x_t)$ and the discrete distribution $S_{t,l-1}$ using Eq. (22). For layer l = 1, we compute the value of $\alpha(x_t)$ according to Eq. (19). For l > 1, we force $\alpha(x_t) = 0$ to prevent any sampling from $p(y_t \mid x_t)$, because samples from $p(y_t \mid x_t)$ are already taken at l = 1. The particles are then sampled from the prior distribution and the importance weights of the particles are computed according to Eq. (23). At the end of each layer, the covariance matrix $\Sigma$ corresponding to the motion model is shrunk by a factor of 0.5 to allow the search in the next layer to be focused on a narrower region of the pose space.

TABLE I. ALGORITHM FOR THE ANNEALED GAUSSIAN PROCESS GUIDED PARTICLE FILTER FOR 3D HUMAN POSE TRACKING

Pose samples obtained from $p(y_t \mid x_t)$ need to be aligned if the test images are taken from a camera view which is different from the camera view of the training images. Let $\tau_t$ be the torso orientation of a sample $y_t$ obtained from the discriminative distribution. To align the pose sample $y_t$ to the camera view of the test image, we compute $\tau_t \leftarrow \tau_t - \Delta$, where $\Delta$ is the camera direction angle difference between the training and test views. The angular difference $\Delta$ is determined from the camera calibration matrices of the two camera views. Also, each sample obtained from $p(y_t \mid x_t)$ is a pose vector in terms of Euler angles. We convert the pose vector to a motion vector by concatenating the global translation vector at the beginning of the pose vector. The global translation vectors are obtained by sampling from the first term of the prior distribution given in Eq. (22). In this way, all the samples in the particle set are obtained as motion vectors.
It can be seen that the annealed particle filter (APF) is a special case of the AGP-PF with α = 0. When α = 1, the method does not use the motion model; instead, it performs pose detection by sampling from the discriminative distribution alone and validating the samples using the importance weights computed from the likelihood distribution. Hence, the method of Ref. [8] can be seen as a special case of the AGP-PF with α = 1. Our method, on the other hand, adaptively chooses the value of α between 0 and 1, as the mixing coefficient α(x_t) in Eq. (20) is inversely related to the prediction uncertainty of the discriminative model. Hence the pose predictions that are likely to be correct are retained to guide the tracking process. This provides stable tracking since, at each time step, even if the motion model produces wrong samples, the samples obtained from the discriminative distribution are used to compute the posterior. Also, if the predictions from the discriminative model are uncertain, less emphasis is given to them. In such cases, the tracking is driven more by the results of the previous frame.
V. EXPERIMENTS
A. Dataset Description
We trained and evaluated our proposed 3D human pose tracking method using the HumanEva-I dataset [62]. We used video frames and the corresponding 3D poses of the three subjects from the Walking and Jogging sequences of the dataset to train and evaluate our approach. The dataset was originally partitioned into training, validation and testing sets. However, as the ground truth of the testing set was not provided, we used the validation set (which provides ground truth) as our testing set and the original training set as our training set. Table II shows the number of images in the training and testing sets for each activity. For each image, the corresponding ground truth 3D motion is given by the 3D pose and the global translation vector. The 3D pose is given by the relative orientations of the 10 body parts in terms of Euler angles. We also used the Walking data of subject S2 from the HumanEva-II dataset to compare our method with other approaches.

B. Training
For all the images in the training and testing sets, we extracted the silhouette images using the background subtraction method described in Ref. [50]. We computed 64-dimensional DCT shape descriptor vectors from the silhouettes following the process described in Section III-A. We then trained the mixture of Gaussian Process regressors which maps the DCT descriptor to the 3D poses using the supervised learning approach described in Section III-B. We set the number of clusters to K = 6 so as to allow sufficient partitioning of the pose space and to model the potential ambiguities associated with the silhouettes, as described in Section III-B1. We trained a Gaussian Process regressor for each cluster. The final output of the discriminative model is the mixture of Gaussians given by Eq. (10).

C. Tracking and Evaluation
The discriminative model is combined with the motion model to track the pose using our proposed AGP-PF algorithm described in Section IV. We sampled 300 particles at each iteration of the AGP-PF. The final output of the AGP-PF is a 33-dimensional motion vector whose components are the global translation vector and the relative Euler angles of the ten body parts. Given the lengths of the body parts, we converted the motion vector to the absolute 3D joint locations using forward kinematics. We computed the 3D error between the estimated 3D joint locations $\bar{y}$ and the ground truth 3D joint locations $\hat{y}$ as follows [58]:

$$E(\bar{y}, \hat{y}) = \frac{1}{J} \sum_{i=1}^{J} \| m_i(\bar{y}) - m_i(\hat{y}) \|,$$

where $m_i(y) \in \mathbb{R}^3$ denotes the three-dimensional coordinates of the ith joint location from the pose vector $y \in \mathbb{R}^d$; $\|\cdot\|$ denotes the Euclidean distance; and J is the number of joint locations. The formula measures the 3D error in mm between the two pose vectors. The mean 3D error over the T test images is computed by $\bar{E} = \frac{1}{T}\sum_{i=1}^{T} E(\bar{y}^{(i)}, \hat{y}^{(i)})$.

Fig. 8. The variance of the output from the GP regressor and the mixing coefficient (α) for each image frame from the Walking sequence of Subject S2. The mixing coefficient (α) was computed according to Eq. (19).

TABLE II. NUMBER OF IMAGES USED FOR TRAINING AND TESTING SETS IN THE HUMANEVA-I DATASET

We investigated three cases for pose tracking. The first is the AGP-PF, where the value of the mixing coefficient α is set according to the variance of the most likely Gaussian component of the discriminative distribution, as given in Eq. (19). In this case, α takes values between 0 and 1. When the variance of the discriminative distribution is lower, α is set to a higher value; consequently, more samples are taken from the discriminative distribution during tracking. Conversely, a higher variance of the discriminative distribution sets α to a lower value, so fewer samples are taken from the discriminative distribution. Since a higher variance of the discriminative distribution implies an uncertain prediction and vice versa, the discriminative distributions which are more certain are automatically selected for tracking. Figure 8 shows an example of the inverse relationship between the variance of the discriminative distribution and the mixing coefficient α for each image in the video. It can be seen that the value of α varies across image frames.
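The evaluation metric is straightforward to reproduce; in the sketch below, `forward_kinematics` is an assumed routine that maps a motion vector to the J × 3 array of joint positions:

```python
import numpy as np

def pose_error_mm(joints_est, joints_gt):
    """Mean per-joint Euclidean error E of Section V-C, in mm.

    joints_est, joints_gt: (J, 3) arrays of 3D joint locations.
    """
    return np.linalg.norm(joints_est - joints_gt, axis=1).mean()

def mean_sequence_error(motion_vectors, gt_joints, forward_kinematics):
    """Mean 3D error over T frames: E_bar = (1/T) * sum_i E(y_i, y_hat_i)."""
    errors = [pose_error_mm(forward_kinematics(y), g)
              for y, g in zip(motion_vectors, gt_joints)]
    return float(np.mean(errors))
```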
In the second case, we set α = 1, which suppresses the sequential sampling nature of the AGP-PF. The search is only performed in the pose space predicted by the GP regression and discards the output pose of the preceding frame. In this case, the motion model is only used to predict the global translation vectors. We refer to this case as the Annealed Gaussian Process (AGP) method.

Fig. 9. Mixing coefficient (α) versus variance σ² for different values of the control parameter λ.
In the third case, we set α = 0, so only the motion model is used to predict the pose and the discriminative distribution from the Gaussian Process regressors is discarded. This case is equivalent to the annealed particle filter (APF). It is to be noted that all three cases use annealing for pose tracking. We also evaluated the performance of a standard particle filter (PF) and of the GP-PF, which does not use annealing for tracking.

1) Parameter Selection: There are four parameters that need to be set in the tracking system. The first one is the value of λ in Eq. (19). Figure 9 shows the relationship of α in terms of σ² for different values of the control parameter λ. We empirically observed that predictions from the MGP regressor whose variance σ² is greater than 0.015 have larger pose estimation errors. In order to suppress the influence of such predictions on tracking, λ should be set so that α becomes 0 when σ² > 0.015. We therefore select λ = 0.003.
The second parameter is the sampling diagonal covariance matrix $\Sigma$ of the motion model. The diagonal components of $\Sigma$ correspond to the sampling variance of each body angle. They are computed so that the standard deviation of each body angle equals the maximum absolute inter-frame angular difference for a particular activity [27].
The third free parameter is the number of particles n. We found its optimal value via validation. Figure 10 plots the number of particles n versus the mean 3D errors for the different tracking methods. The mean 3D errors were computed using 150 images of a walking human subject. It can be seen that for the AGP-PF, APF and GP-PF, the performance did not improve for n > 300. Hence we set the optimal value to n = 300. Our proposed AGP-PF gave the lowest error for all numbers of particles. The figure also depicts the advantage of annealing through the improved performance of the AGP-PF and APF over that of the GP-PF and PF.
The fourth free parameter is the number of annealing layers M. We set M = 5, as the pose tracking performance does not improve for M > 5.

Fig. 10. Number of particles (n) versus mean 3D pose tracking errors for different pose tracking methods.

TABLE III. COMPARISON OF 3D POSE ESTIMATION ERRORS OF THE MGP REGRESSOR FOR DIFFERENT TYPES OF GATING FUNCTIONS ON WALKING AND JOGGING ACTIVITIES

TABLE IV. THE MEAN AND STANDARD DEVIATION OF 3D TRACKING ERRORS IN mm OF AGP-PF, AGP AND APF. THE TRACKING WAS PERFORMED ON IMAGES CAPTURED BY (A) THREE CAMERAS AND (B) A SINGLE CAMERA

Table III compares the pose estimation performance of the MGP regressor for different choices of gating functions on the Walking and Jogging sequences of the dataset. The performance of the MGP regressor is superior when the multinomial logistic regressor is used. We therefore incorporate the multinomial logistic regressor in the gating function of our MGP regressor. It should be noted that, in order to compute the 3D pose estimation error, the joint locations are measured relative to the torso joint.
2) Experimental Results: Table IV(a) and IV(b) show the mean 3D tracking errors for the three cases for subjects S1, S2 and S3 for the Walking and Jogging activities. Table IV(a) shows the tracking errors for a single camera whereas Table IV(b) shows the tracking errors for three cameras. Also included in the tables are the standard deviations of these errors. The results show that our AGP-PF with a dynamic mixing coefficient gave the lowest mean errors. It also produced the smallest standard deviations, denoting that the poses estimated using our method are more stable. It can be seen that the standard particle filter with annealing produced a larger error because tracking failed at an early stage. Although the pose detection method produced smaller errors than the APF, the standard deviations of the errors were larger than those from the AGP-PF. Moreover, the AGP-PF with dynamic mixing coefficients was able to track the pose more accurately over the frames and to recover from mistracking.

Fig. 11. Comparison of the 3D tracking errors in mm for the Walking sequence evaluated from the AGP-PF, APF and AGP pose tracking methods for (a) a single camera and (b) multiple cameras.

Fig. 12. Examples of the 3D poses obtained from the GP regression (rendered in blue) and AGP-PF tracking (rendered in green). Below each output is the corresponding value of α. Each 3D pose is illustrated using the boundaries of the projected cylinders of the 3D pose (Figure best viewed in color).
Fig. 13. (a) An image from the test set. (b) Corresponding silhouette image. (c) & (d) The 3D pose estimates of the two GP regressors with the highest gating probability, with red denoting left limbs and blue denoting right limbs. The associated gating probabilities are displayed below each output pose. These two probable solutions denote the pose ambiguities associated with the silhouette. (e) Our AGP-PF uses them as input for 3D pose tracking (Figure best viewed in color).

Fig. 14. Comparison of the ground truth and the estimated knee flexion angle using our proposed AGP-PF method.

Figure 11(a) and (b) show the pose tracking errors on the Walking sequence for all three cases under single and multiple camera tracking. It can be seen that AGP-PF gave the lowest errors for most of the frames and provided more stable pose estimates than the other two cases for both single and multiple camera tracking. AGP-PF performs tracking by giving higher weights to the correct output poses from the GP regression while discarding incorrect output poses. Examples of how incorrect output poses from the GP regression are given smaller weights during tracking are shown in Fig. 12. In such cases, our AGP-PF gives more weight to the poses sampled from the motion model and hence makes correct predictions.
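To make this weighting scheme concrete, the following Python sketch draws particles from a two-part proposal that mixes the GP regression output with a stationary motion model through a mixing coefficient alpha; the function names and the simple random-walk motion model are illustrative assumptions, and the sketch omits the annealing layers of the full AGP-PF.

import numpy as np

def sample_proposal(particles, gp_means, gp_covs, gp_weights, alpha,
                    motion_std, rng=None):
    # With probability alpha, a particle is drawn from the discriminative
    # distribution (a mixture of Gaussians from the GP regressors);
    # otherwise it is diffused from the previous frame's particle by a
    # stationary random-walk motion model.
    rng = np.random.default_rng() if rng is None else rng
    n, d = particles.shape
    new_particles = np.empty_like(particles)
    for i in range(n):
        if rng.random() < alpha:
            # Discriminative part: pick a mixture component, then sample it.
            k = rng.choice(len(gp_weights), p=gp_weights)
            new_particles[i] = rng.multivariate_normal(gp_means[k], gp_covs[k])
        else:
            # Generative part: random walk around the previous particle.
            new_particles[i] = particles[i] + rng.normal(0.0, motion_std, size=d)
    return new_particles

A small alpha, triggered by high GP prediction uncertainty, shifts the sampling toward the motion model, which is the behavior illustrated in Fig. 12.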
Figure 13 shows an example of a multi-modal pose output given by the mixture of GP models. The 3D pose estimates of the two GP regressors with the highest gating probability are shown in Fig. 13(c) and (d). These two probable solutions denote the pose ambiguities associated with the silhouette. Our AGP-PF tracking method uses them to predict the correct 3D pose, as shown in Fig. 13(e). Figure 14 compares the knee flexion angles estimated using our method with the ground-truth knee flexion angles for each video frame. Figures 15 and 16 display some of the output poses predicted using our method for the Walking and Jogging sequences. The faster and larger arm and leg movements of the human subject in the Jogging sequence make the pose estimation problem more challenging. Our experiments show that our proposed AGP-PF can effectively track the 3D pose in a video sequence. Example videos illustrating the tracking of the 3D pose are available.1
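As a reference for how the plotted angle can be derived from the tracked joints, the sketch below computes a knee flexion angle from the 3D hip, knee, and ankle positions; the paper does not spell out its exact angle convention, so the zero-at-full-extension convention used here is an assumption.

import numpy as np

def knee_flexion_angle(hip, knee, ankle):
    # Flexion is measured as the deviation from a fully extended leg:
    # 0 degrees when hip, knee and ankle are collinear, increasing as
    # the knee bends.
    thigh = hip - knee                  # vector from knee to hip
    shank = ankle - knee                # vector from knee to ankle
    cos_a = np.dot(thigh, shank) / (np.linalg.norm(thigh) * np.linalg.norm(shank))
    inner = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    return 180.0 - inner                # 180-degree inner angle = straight leg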
1 0ca8f1da71fe910ef02df81c.au/~suman/videos
Fig. 15. Pose estimation results on some of the images of the test set of the Walking sequence. Each 3D pose is illustrated using the boundaries of the projected cylinders of the 3D pose (Figure best viewed in color).
TABLE V. Comparison of Mean 3D Tracking Errors (mm) of Our AGP-PF Method With Other Approaches on the Walking Activity of the HumanEva-I Dataset. The Standard Deviation Over Frames Is Not Provided by [43].
C. Comparison With Other Works
We compared our work with the state-of-the-art tracking methods of Refs. [4], [32], [34], [43], [58], and [63]. These approaches use an annealed particle filter [58], a smoothing particle filter [63], a particle filter with a physics-based prior [34], local optimization [4], a hybrid of local and global optimization [32], and a latent variable model [43] for 3D pose tracking.
Table V compares our pose tracking results with the smoothing particle filter of Ref. [63] and the latent variable model of Ref. [43] on the HumanEva-I dataset. The results show that our AGP-PF outperforms the method of Ref. [63] for all subjects. The mean 3D tracking error of our approach is lower than that of Ref. [43] for Subjects S1 and S2, but higher for Subject S3.
Table VI compares our pose tracking results with those of other state-of-the-art approaches on the Walking sequence of Subject S2 of the HumanEva-II dataset. The camera view used to train the discriminative model is different from the view of Camera C1 of the HumanEva-II dataset, which is used as the test sequence.
Fig. 16. Pose estimation results on some of the images of the test set of the Jogging sequence. Each 3D pose is illustrated using the boundaries of the projected cylinders of the 3D pose. In comparison with the Walking sequence in Fig. 15, because of the larger and faster arm and leg movements, the estimated poses for this sequence are less accurate (Figure best viewed in color).
TABLE VI. 3D Pose Tracking Errors of Our AGP-PF Method and Some State-of-the-Art Pose Tracking Approaches on the HumanEva-II Dataset. The Walking Activity of Subject S2 Is Used in the Experiment.
Therefore, we normalize the view of the pose predicted by the discriminative model (i.e., align it with respect to the test camera view) before tracking. With this pre-processing step, we obtained a mean tracking error of 73.0 mm, which surpasses the performance of Refs. [58] and [4]. This shows that our method is able to generalize well not only with respect to the subjects' gaits and genders, but also between camera views.
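One way to realize this view normalization is sketched below, under the assumption that the training and test views differ mainly by a yaw rotation about the vertical axis; the function and its yaw parameters are illustrative rather than the authors' actual alignment code.

import numpy as np

def normalize_view(pose, yaw_train, yaw_test):
    # pose: (J, 3) array of torso-relative joint positions predicted in
    # the training camera's view; yaw_train, yaw_test: camera yaw angles
    # in radians. The pose is rotated about the vertical (y) axis so it
    # is expressed in the test camera's view.
    theta = yaw_test - yaw_train
    c, s = np.cos(theta), np.sin(theta)
    rot_y = np.array([[c, 0.0, s],
                      [0.0, 1.0, 0.0],
                      [-s, 0.0, c]])
    return pose @ rot_y.T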
We also compared our algorithm with the approaches in Refs. [34] and [32], which achieve a better performance (average errors of 37 mm and 53 mm, respectively). This is understandable because Ref. [34] uses strong priors based on a complex bio-mechanical model of the human body, and Ref. [32] uses a local optimization with a surface-based 3D human model whereby a person's body is scanned off-line (using an expensive body scanner). This
phase requires the full cooperation of the person, which is not required in our case.
VI. CONCLUSION
We have presented an Annealed Gaussian Process Guided Particle Filter (AGP-PF) for 3D human pose tracking in video sequences. Our method effectively combines the discriminative distribution obtained from the mixture of Gaussian process regression models with a motion model to obtain accurate and stable tracking of the 3D pose in each video frame. We use the prediction uncertainty obtained from the Gaussian process regression to dynamically determine the contribution of the discriminative model to tracking. Our method does not require initialization and can resolve pose ambiguities during tracking using motion information and multi-view images. Experimental results show that our proposed AGP-PF can accurately track the 3D pose over long video sequences. Although this paper uses a stationary motion model for pose tracking, we believe that the pose could be tracked more accurately with a learned motion model. Tracking in a lower dimensional latent space could also be employed for more efficient performance. Moreover, the scalability of our method could be improved by training the discriminative model using data from various other activities.
REFERENCES
[1] C. Sminchisescu and B. Triggs, "Estimating articulated human motion with covariance scaled sampling," Int. J. Robot. Res., vol. 22, no. 6, pp. 371–391, Jun. 2003.
[2] J. Deutscher, A. Blake, and I. Reid, "Articulated body motion capture by annealed particle filtering," in Proc. IEEE Conf. CVPR, vol. 2, Jun. 2000, pp. 126–133.
[3] M. Isard and A. Blake, "Condensation—Conditional density propagation for visual tracking," Int. J. Comput. Vis., vol. 29, no. 1, pp. 5–28, Aug. 1998.
[4] S. Corazza, L. Mündermann, E. Gambaretto, G. Ferrigno, and T. P. Andriacchi, "Markerless motion capture through visual hull, articulated ICP and subject specific model generation," Int. J. Comput. Vis., vol. 87, nos. 1–2, pp. 156–169, Mar. 2010.
[5] A. Agarwal and B. Triggs, "Monocular human motion capture with a mixture of regressors," in Proc. IEEE Comput. Soc. Conf. CVPR, vol. 3, Jun. 2005, p. 72.
[6] G. Shakhnarovich, P. Viola, and T. Darrell, "Fast pose estimation with parameter-sensitive hashing," in Proc. IEEE Int. Conf. Comput. Vis., vol. 2, Oct. 2003, pp. 750–757.
[7] A. Kanaujia and D. Metaxas, "Learning ambiguities using Bayesian mixture of experts," in Proc. 18th IEEE Int. Conf. Tools Artif. Intell., Nov. 2006, pp. 436–440.
[8] R. Rosales and S. Sclaroff, "Combining generative and discriminative models in a framework for articulated pose estimation," Int. J. Comput. Vis., vol. 67, no. 3, pp. 251–276, May 2006.
[9] L. Sigal, A. Balan, and M. Black, "Combined discriminative and generative articulated pose and non-rigid shape estimation," in Advances in Neural Information Processing Systems. Cambridge, MA, USA: MIT Press, 2008, pp. 1337–1344.
[10] A. Agarwal and B. Triggs, "Recovering 3D human pose from monocular images," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 1, pp. 44–58, Jan. 2006.
[11] C. Sminchisescu, A. Kanaujia, Z. Li, and D. Metaxas, "Discriminative density propagation for 3D human motion estimation," in Proc. IEEE Conf. CVPR, vol. 1, Jun. 2005, pp. 390–397.
[12] R. Urtasun and T. Darrell, "Sparse probabilistic regression for activity-independent human pose inference," in Proc. IEEE Conf. CVPR, Jun. 2008, pp. 1–8.
[13] X. Zhao, Y. Fu, and Y. Liu, "Human motion tracking by temporal-spatial local Gaussian process experts," IEEE Trans. Image Process., vol. 20, no. 4, pp. 1141–1151, Apr. 2011.
[14] L. Bo, C. Sminchisescu, A. Kanaujia, and D. Metaxas, "Fast algorithms for large scale conditional 3D prediction," in Proc. IEEE Conf. CVPR, Jun. 2008, pp. 1–8.
[15] A. Kanaujia, C. Sminchisescu, and D. N. Metaxas, "Semi-supervised hierarchical models for 3D human pose reconstruction," in Proc. IEEE Conf. CVPR, Jun. 2007, pp. 1–8.
[16] H. Ning, W. Xu, Y. Gong, and T. Huang, "Discriminative learning of visual words for 3D human pose estimation," in Proc. IEEE Conf. CVPR, Jun. 2008, pp. 1–8.
[17] A. Bissacco, M.-H. Yang, and S. Soatto, "Fast human pose estimation using appearance and motion via multi-dimensional boosting regression," in Proc. IEEE Conf. CVPR, Jun. 2007, pp. 1–8.
[18] J. Shotton, A. Fitzgibbon, M. Cook, T. Sharp, M. Finocchio, R. Moore, A. Kipman, and A. Blake, "Real-time human pose recognition in parts from single depth images," in Proc. CVPR, 2011.
[19] A. Agarwal and B. Triggs, "A local basis representation for estimating human pose from cluttered images," in Proc. ACCV, Jan. 2006, pp. 50–59.
[20] S. Sedai, M. Bennamoun, and D. Q. Huynh, "Context-based appearance descriptor for 3D human pose estimation from monocular images," in Proc. DICTA, Dec. 2009, pp. 484–491.
[21] S. Sedai, M. Bennamoun, and D. Q. Huynh, "Localized fusion of shape and appearance features for 3D human pose estimation," in Proc. BMVC, Sep. 2010, pp. 1–10.
[22] S. Sedai, M. Bennamoun, and D. Q. Huynh, "Discriminative fusion of shape and appearance features for human pose estimation," Pattern Recognit., to be published.
[23] A. Fossati, M. Salzmann, and P. Fua, "Observable subspaces for 3D human motion recovery," in Proc. IEEE CVPR, Jun. 2009, pp. 1137–1144.
[24] C. Ionescu, L. Bo, and C. Sminchisescu, "Structural SVM for visual localization and continuous state estimation," in Proc. 12th Int. Conf. Comput. Vis., Oct. 2009, pp. 1157–1164.
[25] L. Bo and C. Sminchisescu, "Twin Gaussian processes for structured prediction," Int. J. Comput. Vis., vol. 87, nos. 1–2, pp. 28–52, Mar. 2010.
[26] R. Navaratnam, A. Fitzgibbon, and R. Cipolla, "The joint manifold model for semi-supervised multi-valued regression," in Proc. 11th IEEE Int. Conf. Comput. Vis., Oct. 2007, pp. 1–8.
[27] J. Deutscher and I. D. Reid, "Articulated body motion capture by stochastic search," Int. J. Comput. Vis., vol. 61, no. 2, pp. 185–205, Feb. 2005.
[28] C. Bregler and J. Malik, "Tracking people with twists and exponential maps," in Proc. IEEE Comput. Soc. Conf. CVPR, Jun. 1998, pp. 8–15.
[29] M. W. Lee and I. Cohen, "A model-based approach for estimating human 3D poses in static images," IEEE Trans. Pattern Anal. Mach. Intell., vol. 28, no. 6, pp. 905–916, Jun. 2006.
[30] D. Ramanan, D. Forsyth, and A. Zisserman, "Strike a pose: Tracking people by finding stylized poses," in Proc. IEEE Comput. Soc. Conf. CVPR, vol. 1, Jun. 2005, pp. 271–278.
[31] J. Li and N. M. Allinson, "A comprehensive review of current local features for computer vision," Neurocomputing, vol. 71, nos. 10–12, pp. 1771–1787, Jun. 2008.
[32] J. Gall, B. Rosenhahn, T. Brox, and H.-P. Seidel, "Optimization and filtering for human motion capture," Int. J. Comput. Vis., vol. 87, nos. 1–2, pp. 75–92, 2010.
[33] H. Sidenbladh, M. J. Black, and D. J. Fleet, "Stochastic tracking of 3D human figures using 2D image motion," in Proc. ECCV, vol. 2, 2000, pp. 702–718.
[34] M. A. Brubaker, D. J. Fleet, and A. Hertzmann, "Physics-based person tracking using the anthropomorphic walker," Int. J. Comput. Vis., vol. 87, nos. 1–2, pp. 140–155, Mar. 2010.
[35] H. Sidenbladh and M. J. Black, "Learning the statistics of people in images and video," Int. J. Comput. Vis., vol. 54, nos. 1–3, pp. 181–207, 2003.
[36] J. Deutscher, A. J. Davison, and I. D. Reid, "Automatic partitioning of high dimensional search spaces associated with articulated body motion capture," in Proc. IEEE Comput. Soc. Conf. CVPR, vol. 2, 2001, pp. II-669–II-676.
[37] A. Sundaresan and R. Chellappa, "Multicamera tracking of articulated human motion using shape and motion cues," IEEE Trans. Image Process., vol. 18, no. 9, pp. 2114–2126, Sep. 2009.
[38] R. Navaratnam, A. Thayananthan, P. H. Torr, and R. Cipolla, "Hierarchical part-based human body pose estimation," in Proc. BMVC, Sep. 2005.
[39] D. Ramanan, D. Forsyth, and A. Zisserman, "Tracking people by learning their appearance," IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 1, pp. 65–81, Jan. 2007.
[40] R. Li, T.-P. Tian, S. Sclaroff, and M.-H. Yang, "3D human motion tracking with a coordinated mixture of factor analyzers," Int. J. Comput. Vis., vol. 87, nos. 1–2, pp. 170–190, Mar. 2010.
[41] G. W. Taylor and G. E. Hinton, "Factored conditional restricted Boltzmann machines for modeling motion style," in Proc. 26th Annu. Int. Conf. Mach. Learn., Jun. 2009, pp. 1025–1032.
[42] R. Urtasun, D. J. Fleet, and P. Fua, "3D people tracking with Gaussian process dynamical models," in Proc. IEEE Comput. Soc. Conf. CVPR, vol. 1, Jun. 2006, pp. 238–245.
[43] A. Yao, J. Gall, L. J. V. Gool, and R. Urtasun, "Learning probabilistic non-linear latent variable models for tracking complex activities," in Proc. Adv. Neural Inf. Process. Syst., 2011, pp. 1359–1367.
[44] J. Wang, D. Fleet, and A. Hertzmann, "Gaussian process dynamical models for human motion," IEEE Trans. Pattern Anal. Mach. Intell., vol. 30, no. 2, pp. 283–298, Feb. 2008.
[45] C. Sminchisescu, A. Kanaujia, and D. Metaxas, "Learning joint top-down and bottom-up processes for 3D visual inference," in Proc. IEEE Comput. Soc. Conf. CVPR, vol. 2, Jun. 2006, pp. 1743–1752.
[46] M. Salzmann and R. Urtasun, "Combining discriminative and generative methods for 3D deformable surface and articulated pose reconstruction," in Proc. IEEE Conf. CVPR, Jun. 2010, pp. 647–654.
[47] S. Sedai, D. Q. Huynh, and M. Bennamoun, "Supervised particle filter for tracking 2D human pose in monocular video," in Proc. IEEE WACV, Jan. 2011, pp. 367–373.
[48] J. Ko and D. Fox, "GP-BayesFilters: Bayesian filtering using Gaussian process prediction and observation models," Auton. Robots, vol. 27, no. 1, pp. 75–90, 2009.
[49] I. Patras and E. R. Hancock, "Coupled prediction classification for robust visual tracking," IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 9, pp. 1553–1567, Sep. 2010.
[50] A. Elgammal, R. Duraiswami, D. Harwood, and L. Davis, "Background and foreground modeling using nonparametric kernel density estimation for visual surveillance," Proc. IEEE, vol. 90, no. 7, pp. 1151–1163, Jul. 2002.
[51] S. Sedai, M. Bennamoun, and D. Q. Huynh, "Evaluating shape and appearance descriptors for 3D human pose estimation," in Proc. 6th IEEE ICIEA, Jun. 2011, pp. 293–298.
[52] P. Tresadern and I. Reid, "An evaluation of shape descriptors for image retrieval in human pose estimation," in Proc. BMVC, vol. 2, Sep. 2007, pp. 800–809.
[53] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning (Adaptive Computation and Machine Learning). Cambridge, MA, USA: MIT Press, 2005.
[54] V. Tresp, "Mixtures of Gaussian processes," in Advances in Neural Information Processing Systems. Cambridge, MA, USA: MIT Press, 2001, pp. 654–660.
[55] B. Krishnapuram, L. Carin, M. A. T. Figueiredo, and A. J. Hartemink, "Sparse multinomial logistic regression: Fast algorithms and generalization bounds," IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 6, pp. 957–968, Jun. 2005.
[56] C. H. Ek, J. Rihan, P. H. Torr, G. Rogez, and N. D. Lawrence, "Ambiguity modeling in latent spaces," in Proc. 5th Int. Workshop Mach. Learn. Multimodal Interact., 2008, pp. 62–73.
[57] C. Sminchisescu and A. Telea, "Human pose estimation from silhouettes: A consistent approach using distance level sets," in Proc. Int. Conf. Comput. Graph., Visualizat. Comput. Vis. (WSCG), 2002, pp. 413–420.
[58] L. Sigal, A. Balan, and M. Black, "HumanEva: Synchronized video and motion capture dataset and baseline algorithm for evaluation of articulated human motion," Int. J. Comput. Vis., vol. 87, nos. 1–2, pp. 4–27, Mar. 2010.
[59] J. Deutscher, B. North, B. Bascle, and A. Blake, "Tracking through singularities and discontinuities by random sampling," in Proc. 7th IEEE Int. Conf. Comput. Vis., vol. 2, Sep. 1999, pp. 1144–1149.
[60] A. Doucet, N. De Freitas, and N. Gordon, Sequential Monte Carlo Methods in Practice. New York, NY, USA: Springer-Verlag, 2001.
[61] A. Pfeffer, "Sufficiency, separability and temporal probabilistic models," in Proc. 17th Conf. Uncertainty Artif. Intell. (UAI), 2001, pp. 421–428.
[62] L. Sigal and M. J. Black, "HumanEva: Synchronized video and motion capture dataset for evaluation of articulated human motion," Dept. Comput. Sci., Brown Univ., Providence, RI, USA, Tech. Rep. CS-06-08, 2006.
[63] P. Peursum, S. Venkatesh, and G. West, "A study on smoothing for particle-filtered 3D human body tracking," Int. J. Comput. Vis., vol. 87, nos. 1–2, pp. 53–74, Mar. 2010.
Suman Sedai received the M.Sc. degree from Inha University, Incheon, Korea, and the Ph.D. degree from the University of Western Australia, Crawley, Australia, in 2007 and 2012, respectively. His current research interests include image processing, visual tracking, object recognition, pattern recognition, and machine learning. He is currently a Post-Doctoral Researcher with IBM Research, Melbourne, Australia.
Mohammed Bennamoun received the M.Sc. degree in control theory from Queen's University, Kingston, ON, Canada, and the Ph.D. degree in computer vision from Queen's/Q.U.T, Brisbane, Australia. He was a Lecturer in robotics with Queen's University and joined QUT in 1993 as an Associate Lecturer. He is currently a Winthrop Professor. He served as the Head of the School of Computer Science and Software Engineering, The University of Western Australia, Crawley, Australia, from 2007 to 2012. He served as the Director of the University Centre at QUT: The Space Centre for Satellite Navigation from 1998 to 2002. He was an Erasmus Mundus Scholar and a Visiting Professor with the University of Edinburgh, Edinburgh, U.K., in 2006. He was a Visiting Professor with the Centre National de la Recherche Scientifique and Telecom Lille 1, France, in 2009, Helsinki University of Technology, Helsinki, Finland, in 2006, and the University of Bourgogne and Paris 13, Paris, France, from 2002 to 2003. He is the co-author of Object Recognition: Fundamentals and Case Studies (Springer-Verlag, 2001) and the co-author of an edited book on Ontology Learning and Knowledge Discovery Using the Web in 2011. He has published over 250 journal and conference publications and secured highly competitive national grants from the Australian Research Council (ARC). Some of these grants were in collaboration with industry partners (through the ARC Linkage Project scheme) to solve real research problems for industry, including Swimming Australia, the West Australian Institute of Sport, a textile company (Beaulieu Pacific), and AAM-GeoScan. He has worked on research problems and collaborated (through joint publications, grants, and supervision of Ph.D. students) with researchers from different disciplines, including animal biology, speech processing, biomechanics, ophthalmology, dentistry, linguistics, robotics, photogrammetry, and radiology. He received the Best Supervisor of the Year Award from QUT, and an award for research supervision from UWA in 2008. He served as a Guest Editor for a couple of special issues in international journals, such as the International Journal of Pattern Recognition and Artificial Intelligence. He was selected to give conference tutorials at the European Conference on Computer Vision and the International Conference on Acoustics, Speech and Signal Processing. He has organized several special sessions for conferences, including a special session for the IEEE International Conference on Image Processing. He was on the program committee of many international conferences and has contributed to the organization of many local and international conferences. His current research interests include control theory, robotics, obstacle avoidance, object recognition, artificial neural networks, signal/image processing, and computer vision (particularly 3D).
Du Q. Huynh is an Associate Professor with the School of Computer Science and Software Engineering, University of Western Australia, Crawley, Australia. She received the Ph.D. degree in computer vision from the University of Western Australia in 1994. She was with the Australian Cooperative Research Centre for Sensor Signal and Information Processing, Murdoch University, Murdoch, Australia. She has been a Visiting Scholar with Lund University, Lund, Sweden, Malmö University, Malmö, Sweden, the Chinese University of Hong Kong, Hong Kong, Nagoya University, Nagoya, Japan, Gunma University, Gunma, Japan, and the University of Melbourne, Melbourne, Australia. She has received several grants funded by the Australian Research Council. Her current research interests include shape from motion, multiple view geometry, video image processing, and visual tracking.