Research
My expertise lies in computer vision, machine learning, and natural language processing, with a focus on creating efficient AI solutions for large-scale data analysis. My work, which includes projects in person re-identification, multi-camera tracking, and deep learning, aims to develop advanced algorithms for applications such as video surveillance and image recognition. This research not only contributes to the academic field but also has practical applications in various industries, including security and media. |
|
IGMG: Instance-guided multi-granularity for domain generalizable person re-identification
Amran Bhuiyan,
Jimmy X. Huang,
Aijun An
Computer Vision and Image Understanding (CVIU), 2023
This paper investigates how effectively multi-granularity methods address domain shift in person re-identification and introduces a novel framework called Instance-Guided Multi-Granularity (IGMG). IGMG applies non-parametric Instance Normalization (IN) to obtain style-free features at different levels of detail, and improves the model's generalization by sharing low- and mid-level features.
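As a rough illustration of what non-parametric instance normalization does, the sketch below normalizes each channel of a single feature map over its spatial positions, with no learnable scale or shift. It is a generic, assumption-laden example (the function name `instance_norm` and the nested-list layout are hypothetical), not the IGMG implementation.

```python
import math


def instance_norm(feature_map, eps=1e-5):
    """Normalize each channel of ONE sample over its spatial positions.

    feature_map: list of channels, each a 2D list (H x W) of floats.
    No learnable scale or shift is applied (the non-parametric form).
    """
    normalized = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        normalized.append([[(v - mean) * scale for v in row] for row in channel])
    return normalized


# Two channels with the same pattern but different brightness and contrast
# (a crude stand-in for "style") end up almost identical after normalization.
features = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[10.0, 20.0], [30.0, 40.0]],
]
style_free = instance_norm(features)
```

Because the statistics are computed per instance and per channel, image-specific contrast and brightness are removed, which is the intuition behind calling the resulting features style-free.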
|
|
A Systematic Study and Comprehensive Evaluation of ChatGPT on Benchmark Datasets
Md Tahmid Rahman Laskar,
M Saiful Bari,
Mizanur Rahman,
Amran Bhuiyan,
Shafiq Joty,
Jimmy X. Huang
Association for Computational Linguistics (ACL '23 Findings), 2023
This paper conducts an extensive assessment of ChatGPT's performance across a wide range of academic datasets including tasks such as question answering, text summarization, code generation, commonsense reasoning, mathematical problem-solving, machine translation, bias detection, and ethical considerations. The evaluation covers 140 tasks and analyzes 255K responses, marking the most comprehensive evaluation of ChatGPT within NLP benchmarks. Ultimately, this study aims to evaluate ChatGPT's strengths and weaknesses across diverse tasks and offer valuable insights for future research employing Large Language Models (LLMs).
|
|
System and Method for Identity Preservative Representation of Persons and Objects Using Spatial and Appearance Attributes
Mehrsan Javan Roshtkhari,
Amran Bhuiyan,
Yang Liu,
Parthipan Siva,
Eric Granger,
Ismail Ben Ayed
US Patent, 2022
The method involves processing images to create unique identity-preserving descriptors for individuals or objects by extracting and combining spatial and appearance attributes, and then using these descriptors to differentiate and compare various entities using a predefined mathematical distance metric. |
|
STCA: Utilizing a spatio-temporal cross-attention network for enhancing video person re-identification
Amran Bhuiyan,
Jimmy X. Huang
Image and Vision Computing (IVC), 2022
We propose a Spatio-Temporal Cross-Attention (STCA) network that generates cross-guided attention for video person re-identification. The network leverages 2D and 3D CNNs to identify salient features in videos, improves recognition accuracy through attention-based gating, and is optimized with cosine distance for efficient and precise video re-identification.
|
|
Exploiting prunability for person re-identification
Hugo Masson *,
Amran Bhuiyan *,
Le Thanh Nguyen-Meidine *,
Mehrsan Javan Roshtkhari,
Parthipan Siva,
Ismail Ben Ayed,
Eric Granger
(* Equal Contribution)
EURASIP Journal on Image and Video Processing, 2021
In this paper, we investigate the prunability of different CNN architectures under different design scenarios. We first revisit pruning techniques that are suitable for reducing the computational complexity of deep CNNs applied to person re-identification. These techniques are then analyzed according to their pruning criteria and strategy, and according to different scenarios for exploiting pruning methods when fine-tuning networks to target domains. |
|
Flow guided mutual attention for person re-identification
Madhu Kiran,
Amran Bhuiyan,
Le Thanh Nguyen-Meidine,
Mehrsan Javan Roshtkhari,
Louis-Antoine Blais-Morin,
Ismail Ben Ayed,
Eric Granger
Image and Vision Computing (IVC), 2021
In this paper, the motion pattern of a person is explored as an additional cue for ReID. In particular, a flow-guided Mutual Attention network is proposed for fusing bounding box and optical flow sequences over tracklets using any 2D-CNN backbone, which allows temporal information to be encoded along with spatial appearance information. Our Mutual Attention network relies on joint spatial attention between image and optical flow feature maps to activate a common set of salient features. |
|
Pose Guided Gated Fusion for Person Re-identification
Amran Bhuiyan,
Yang Liu,
Le Thanh Nguyen-Meidine,
Parthipan Siva,
Mehrsan Javan Roshtkhari,
Ismail Ben Ayed,
Eric Granger
Winter Conference on Applications of Computer Vision (WACV), 2020
In this paper, a new deep learning model is proposed for pose-guided re-identification, comprising a deep backbone, a pose estimation network, and a gated fusion network. Given a query image of an individual, the backbone convolutional NN produces the feature embedding required for pair-wise matching with embeddings of reference images, while feature maps from the pose network and from mid-level CNN layers are combined by the gated fusion network to generate pose-guided gating. The proposed framework dynamically activates the most discriminant CNN filters based on pose information in order to perform finer-grained recognition.
|
|
Unsupervised Domain Adaptation in the Dissimilarity Space for Person Re-identification
Djebril Mekhazni,
Amran Bhuiyan,
George S. Eskander Ekladious,
Eric Granger
European Conference on Computer Vision (ECCV), 2020
In this paper, we propose a novel Dissimilarity-based Maximum Mean Discrepancy (D-MMD) loss for aligning pair-wise distances that can be optimized via gradient descent using relatively small batch sizes. From a person ReID perspective, evaluating the D-MMD loss is straightforward, since the tracklet information (provided by a person tracker) makes it possible to label a distance vector as either within-class (within-tracklet) or between-class (between-tracklet). This allows the underlying distribution of target pair-wise distances to be approximated for D-MMD loss optimization, and the source and target distance distributions to be aligned accordingly. Empirical results on three challenging benchmark datasets show that the proposed D-MMD loss decreases as the source and target domain distributions become more similar.
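To make the idea of comparing distance distributions concrete, the sketch below computes a generic, biased estimate of the squared Maximum Mean Discrepancy between two samples of scalar pair-wise distances using a Gaussian kernel. The kernel choice, the bandwidth, and the names `gaussian_kernel` and `mmd_squared` are assumptions for illustration; the exact D-MMD formulation in the paper may differ.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalar distances."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples of scalar distances."""

    def mean_kernel(ps, qs):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in ps for q in qs)
        return total / (len(ps) * len(qs))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical within-class distances from a source and two target domains.
source_within = [0.20, 0.30, 0.25]
target_similar = [0.22, 0.31, 0.27]
target_shifted = [0.80, 0.90, 0.85]

gap_similar = mmd_squared(source_within, target_similar)
gap_shifted = mmd_squared(source_within, target_shifted)
```

In this toy setting the discrepancy is larger for the shifted target than for the similar one, which matches the qualitative behavior described above: the value shrinks as the two distance distributions move closer together.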
|
|
Adaptation of Person Re-identification Models for On-boarding New Camera(s)
Rameswar Panda,
Amran Bhuiyan,
Vittorio Murino,
Amit K. Roy-Chowdhury
Pattern Recognition (PR), 2019.
This paper extends our CVPR 2017 paper providing a new source-target selective adaptation strategy and rigorous experiments on more person re-id datasets.
|
|
RGB-Depth Cross-Modal Person Re-identification
Frank M. Hafner,
Amran Bhuiyan,
Julian F. P. Kooij,
Eric Granger
IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS), 2019.
We develop a novel cross-modal distillation network for robust person re-identification, which learns a shared feature representation space of a person's appearance in both RGB and depth images.
|
|
Unsupervised Adaptive Re-identification in Open World Dynamic Camera Networks
Rameswar Panda *,
Amran Bhuiyan *,
Vittorio Murino,
Amit K. Roy-Chowdhury
(* Equal Contribution)
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
We propose an unsupervised adaptation scheme for re-identification models where a new camera may be temporarily inserted into an existing system to get additional information.
|
|
Exploiting Gaussian Mixture Importance for Person Re-identification
Xiangping Zhu,
Amran Bhuiyan,
Mohamed Lamine Mekhalfi,
Vittorio Murino
14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2017.
We propose a Gaussian Mixture Importance Estimation (GMIE) approach for ReID, which exploits the Gaussian Mixture Models (GMMs) to estimate the observed commonalities of similar and dissimilar person pairs in the feature space. |
|
Person re-identification using sparse representation with manifold constraints
Behzad Mirmahboub,
Hamed Kiani,
Amran Bhuiyan,
Alessandro Perina,
Baochang Zhang,
Alessio Del Bue,
Vittorio Murino
IEEE International Conference on Image Processing, 2016.
In this paper, we propose a novel framework that combines sparse coding and manifold constraints to extract discriminative information from multi-shot images of a pedestrian for person re-identification across a set of non-overlapping surveillance cameras.
|
|
Exploiting multiple detections to learn robust brightness transfer functions in re-identification systems
Amran Bhuiyan,
Alessandro Perina,
Vittorio Murino
IEEE International Conference on Image Processing, 2015.
This paper proposes the use of Cumulative Weighted Brightness Transfer Functions to model appearance variations across cameras. It is a multiple frame-based learning approach that leverages consecutive detections of each individual to transfer appearance, rather than learning a brightness transfer function from pairs of images.
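The sketch below illustrates the general idea of a brightness transfer function estimated from several detections rather than a single image pair: brightness histograms are pooled over all detections from each camera, and levels are mapped by matching the cumulative histograms. It is a plain cumulative variant under stated assumptions (the function names and the flat-list pixel format are hypothetical) and does not reproduce the weighting scheme of the paper.

```python
def cumulative_histogram(detections, levels=256):
    """Normalized cumulative brightness histogram pooled over several detections.

    detections: list of detections, each a flat list of integer brightness values.
    """
    counts = [0] * levels
    for pixels in detections:
        for value in pixels:
            counts[value] += 1
    total = sum(counts)
    cumulative, running = [], 0
    for count in counts:
        running += count
        cumulative.append(running / total)
    return cumulative


def brightness_transfer_function(detections_a, detections_b, levels=256):
    """Map each brightness level of camera A to a level of camera B
    by matching the two pooled cumulative histograms."""
    cdf_a = cumulative_histogram(detections_a, levels)
    cdf_b = cumulative_histogram(detections_b, levels)
    mapping = []
    j = 0
    for level in range(levels):
        # Smallest level in B whose cumulative mass reaches that of `level` in A.
        while j < levels - 1 and cdf_b[j] < cdf_a[level]:
            j += 1
        mapping.append(j)
    return mapping


# Toy example with 8 brightness levels: camera B sees the same person
# two levels brighter than camera A across consecutive detections.
camera_a = [[1, 2], [3]]
camera_b = [[3, 4], [5]]
btf = brightness_transfer_function(camera_a, camera_b, levels=8)
```

Pooling the histograms over consecutive detections is what distinguishes this from a pair-wise estimate: a single badly lit or partially occluded frame has less influence on the resulting mapping.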
|
|
Person Re-identification Using Robust Brightness Transfer Functions Based on Multiple Detections
Amran Bhuiyan,
Behzad Mirmahboub,
Alessandro Perina,
Vittorio Murino
International Conference on Image Analysis and Processing, 2015.
This paper proposes the use of Minimum Multiple Cumulative Brightness Transfer Functions to model appearance variations across cameras. It is a multiple frame-based learning approach that leverages consecutive detections of each individual to transfer appearance, rather than learning a brightness transfer function from pairs of images.
|
|
Person re-identification by discriminatively selecting parts and features
Amran Bhuiyan,
Alessandro Perina,
Vittorio Murino
European Conference on Computer Vision (ECCV) Workshop on Visual Surveillance and Re-Identification (VS-Re-ID), 2014.
Winner of the Intel Best Paper Award
This paper presents a novel appearance-based method for person re-identification. The core idea is to rank and select different body parts on the basis of the discriminating power of their characteristic features. In our approach, we first segment the pedestrian images into meaningful parts, then we extract features from such parts as well as from the whole body and finally, we perform a salience analysis based on regression coefficients.
|
|