| 
            
              | 
                  Md Amran Hossen Bhuiyan
                 
			I am currently a Postdoctoral Researcher in the Information Retrieval and Knowledge Management Lab (IRLAB) at York University, led by Professor Jimmy Huang, where I contribute to AI research at the intersection of computer vision and contextual information retrieval, with a focus on multimodal settings and improving model generalization.
                 
                 
I earned my Ph.D. from the Istituto Italiano di Tecnologia in Genova, Italy, where I was supervised by Prof. Vittorio Murino. During my doctoral studies, I spent six months in 2016 as a visiting research scientist at the Video Computing Group at the University of California, Riverside, guided by Amit K. Roy-Chowdhury. Prior to my Ph.D., I completed my Master's degree at the Lucian Blaga University of Sibiu in Romania as a recipient of the Erasmus Mundus External Window scholarship.
                 
                  Email  / 
                  CV  / 
                  Scholar  / 
                  GitHub  / 
				  LinkedIn 
                 |   |  
              
              | Research
                  My expertise lies in computer vision, machine learning, and natural language processing, with a focus on creating efficient AI solutions for large-scale data analysis. My work, which includes projects in person re-identification, multi-camera tracking, and deep learning, aims to develop advanced algorithms for applications such as video surveillance and image recognition. This research not only contributes to the academic field but also has practical applications in various industries, including security and media.             |  
            
          
          
          
    
      |  | Optimizing domain-generalizable ReID through non-parametric normalization Amran Bhuiyan,
         Jimmy X. Huang,
           Aijun An,
	     Jialie Shen
 Pattern Recognition, 2025
 
 Optimizing deep neural networks to generalize effectively across diverse visual domains remains a key challenge in computer vision, especially in domain-generalizable person re-identification (ReID). The goal of domain-generalizable ReID is to develop robust deep learning (DL) models that are effective across both known (source) and unseen (target) domains. However, many top-performing ReID methods overfit to the source domain, impairing their generalization ability. Previous approaches have employed Instance Normalization (IN) with learnable parameters to generalize domains and eliminate source domain styles. Recently, some DL frameworks have adopted normalization techniques without learnable parameters. We critically examine non-parametric normalization techniques for optimizing the deep ReID model, emphasizing the advantages of using non-parametric instance normalization as a gating mechanism to extract style-independent features at various abstraction levels within both convolutional neural networks (CNNs) and Vision Transformers (ViT). Our framework offers strategic guidance on the optimal placement of non-parametric IN within the network architecture to ensure effective information flow management in subsequent layers. Additionally, we employ one-dimensional Batch Normalization (BN) without learnable parameters at deeper network levels to remove content-related biases from the source domain. Our integrated approach, termed DualNormNP, systematically optimizes the model’s capacity to generalize across varied domains. Comprehensive evaluations on multiple benchmark ReID datasets demonstrate that our approach surpasses current state-of-the-art ReID methods in terms of generalization performance.
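 As a rough illustration of the non-parametric instance normalization discussed above, the following Python sketch normalizes each channel of a single sample over its spatial positions, with no learnable scale or shift. The function name `instance_norm_no_affine`, the nested-list layout, and the `eps` value are assumptions made for this example; it is not the DualNormNP implementation.

```python
import math


def instance_norm_no_affine(feature_map, eps=1e-5):
    """Normalize each channel of ONE sample to zero mean and unit variance.

    feature_map: list of channels, each channel a 2D list (height x width).
    No learnable gamma/beta is applied, hence "non-parametric".
    (Hypothetical helper for illustration only.)
    """
    normalized = []
    for channel in feature_map:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        normalized.append([[(v - mean) * scale for v in row] for row in channel])
    return normalized


if __name__ == "__main__":
    # Two channels with the same spatial pattern but different contrast.
    low_contrast = [[1.0, 2.0], [3.0, 4.0]]
    high_contrast = [[10.0, 20.0], [30.0, 40.0]]
    for channel in instance_norm_no_affine([low_contrast, high_contrast]):
        print(channel)
```

 Because the per-channel mean and variance are removed, two channels that differ only in brightness offset or contrast map to almost the same output, which is the sense in which such a layer discards instance-specific style statistics.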
 |  
      |  | IGMG: Instance-guided multi-granularity for domain generalizable person
            re-identification Amran Bhuiyan,
         Jimmy X. Huang,
           Aijun An
 Computer Vision and Image Understanding (CVIU), 2024
 
 We investigate the effectiveness of multi-granularity methods in addressing domain shift in person re-identification and introduce a novel framework, Instance-Guided Multi-Granularity (IGMG). IGMG employs non-parametric Instance Normalization (IN) to process style-free features at different levels of detail, improving the model's generalization through shareable low- and mid-level features.
 |  
      |  | A Systematic Study and Comprehensive Evaluation of ChatGPT on
            Benchmark Datasets Md Tahmid Rahman Laskar,
         M Saiful Bari,
         Mizanur Rahman,
		Amran Bhuiyan,
     Shafiq Joty,
         Jimmy X. Huang
 Association for Computational Linguistics (ACL '23 Findings), 2023
 
 This paper conducts an extensive assessment of ChatGPT's performance across a wide range of academic datasets including tasks such as question answering, text summarization, code generation, commonsense reasoning, mathematical problem-solving, machine translation, bias detection, and ethical considerations. The evaluation covers 140 tasks and analyzes 255K responses, marking the most comprehensive evaluation of ChatGPT within NLP benchmarks. Ultimately, this study aims to evaluate ChatGPT's strengths and weaknesses across diverse tasks and offer valuable insights for future research employing Large Language Models (LLMs).
 |  
      |  | System and Method for Identity Preservative Representation of Persons and Objects Using Spatial and Appearance Attributes Mehrsan Javan Roshtkhari ,
       
		Amran Bhuiyan,
     Yang Liu ,
     Parthipan Siva ,
     Eric Granger,
         Ismail Ben Ayed
 US Patent , 2022
 
 The method involves processing images to create unique identity-preserving descriptors for individuals or objects by extracting and combining spatial and appearance attributes, and then using these descriptors to differentiate and compare various entities using a predefined mathematical distance metric.
 |  
      |  | STCA: Utilizing a spatio-temporal cross-attention network for enhancing video person re-identification Amran Bhuiyan,
         Jimmy X. Huang
 Image and Vision Computing (IVC), 2022
 
 We propose a Spatio-Temporal Cross-Attention (STCA) network that generates cross-guided attention for video person re-identification. It leverages 2D and 3D CNNs to identify salient features in videos, improves recognition accuracy through attention-based gating, and is optimized with cosine distance for efficient and precise video re-identification.
 |  
      |  | Exploiting prunability for person re-identification Hugo Masson * ,             
        Amran Bhuiyan * ,
         Le Thanh Nguyen-Meidine * ,
         Mehrsan Javan Roshtkhari ,
         Parthipan Siva ,
        Ismail Ben Ayed,    
         Eric Granger
         (* Equal Contribution)
 EURASIP Journal on Image and Video Processing , 2021
 
 In this paper, we investigate the prunability of different CNN architectures under different design scenarios. We first revisit pruning techniques suitable for reducing the computational complexity of deep CNNs applied to person re-identification. These techniques are then analyzed according to their pruning criteria and strategy, and according to different scenarios for exploiting pruning methods when fine-tuning networks to target domains.
 |  
      |  | Flow guided mutual attention for person re-identification Madhu Kiran ,            
        Amran Bhuiyan  ,
         Le Thanh Nguyen-Meidine ,
         Mehrsan Javan Roshtkhari ,
         Louis-Antoine Blais-Morin ,
             Ismail Ben Ayed,    
         Eric Granger
 Image and Vision Computing (IVC), 2021
 
 In this paper, the motion pattern of a person is explored as an additional cue for ReID. In particular, a flow-guided Mutual Attention network is proposed for fusing bounding box and optical flow sequences over tracklets using any 2D-CNN backbone, which makes it possible to encode temporal information along with spatial appearance information. Our Mutual Attention network relies on joint spatial attention between image and optical flow feature maps to activate a common set of salient features.
 |  
            |  | Pose Guided Gated Fusion for Person Re-identification Amran Bhuiyan  ,
               Yang Liu ,            
             
               Le Thanh Nguyen-Meidine ,
               Parthipan Siva ,      
               Mehrsan Javan Roshtkhari ,
                               Ismail Ben Ayed,    
               Eric Granger
 Winter Conference on Applications of Computer Vision (WACV) , 2020
 
 In this paper, a new deep learning model is proposed for pose-guided re-identification, comprising a deep backbone, a pose estimation network, and a gated fusion network. Given a query image of an individual, the backbone CNN produces a feature embedding for pair-wise matching with the embeddings of reference images; feature maps from the pose network and from mid-level CNN layers are combined by the gated fusion network to generate pose-guided gating. The proposed framework makes it possible to dynamically activate the most discriminant CNN filters based on pose information, in order to perform finer-grained recognition.
 |  
                  |  | Unsupervised Domain Adaptation in the Dissimilarity Space for Person Re-identification Djebril Mekhazni, 
                    Amran Bhuiyan  ,
                               
                   
                     George S. Eskander Ekladious,  
                                        Eric Granger
 European Conference on Computer Vision (ECCV) , 2020
 
 In this paper, we propose a novel Dissimilarity-based Maximum Mean Discrepancy (D-MMD) loss for aligning pair-wise distances that can be optimized via gradient descent using relatively small batch sizes. From a person ReID perspective, the evaluation of the D-MMD loss is straightforward, since the tracklet information (provided by a person tracker) makes it possible to label a distance vector as either within-class (within-tracklet) or between-class (between-tracklet). This allows approximating the underlying distribution of target pair-wise distances for D-MMD loss optimization, and accordingly aligning the source and target distance distributions. Empirical results on three challenging benchmark datasets show that the proposed D-MMD loss decreases as the source and target domain distributions become more similar.
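 To make the idea of comparing distance distributions concrete, here is a minimal Python sketch of a biased squared Maximum Mean Discrepancy estimate between two sets of scalar pair-wise distances, using a Gaussian kernel. The function names, the kernel choice, and the `bandwidth` value are assumptions for illustration; this is a generic MMD estimate, not the D-MMD loss as implemented in the paper.

```python
import math


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalar distances."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(source_dists, target_dists, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples of distances.

    source_dists, target_dists: non-empty lists of floats, e.g. within-class
    (within-tracklet) distances from the source and target domains.
    (Hypothetical helper for illustration only.)
    """
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))

    return (
        mean_kernel(source_dists, source_dists)
        + mean_kernel(target_dists, target_dists)
        - 2.0 * mean_kernel(source_dists, target_dists)
    )


if __name__ == "__main__":
    source_within = [0.20, 0.25, 0.30]
    target_within = [0.30, 0.35, 0.40]
    print(mmd_squared(source_within, target_within))
```

 In a setting like the one described above, such an estimate would be computed separately for within-class and between-class distance vectors; a smaller value indicates that the two distance distributions are more alike.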
 |  
                      |  | Adaptation of person re-identification models for on-boarding new camera(s) Rameswar Panda,
                        Amran Bhuiyan  ,
                         Vittorio Murino ,
                       
                         Amit K. Roy-Chowdhury
 Pattern Recognition (PR) , 2019.
 
 This paper extends our CVPR 2017 paper with a new source-target selective adaptation strategy and rigorous experiments on additional person re-identification datasets.
 |  
                          |  | RGB-Depth Cross-Modal Person Re-identification Frank M. Hafner , 
                            Amran Bhuiyan  ,
                             Julian F. P. Kooij , 
                           
                             Eric Granger
 IEEE Conference on Advanced Video and Signal Based Surveillance (AVSS), 2019.
 
 We develop a novel cross-modal distillation network for robust person re-identification, which learns a shared feature representation space of a person's appearance in both RGB and depth images.
 |  
                                |  | Unsupervised Adaptive Re-identification in Open World Dynamic Camera Networks Rameswar Panda *,
                                  Amran Bhuiyan*  ,
                                   Vittorio Murino , 
                                 
                                   Amit K. Roy-Chowdhury  
                                                    
                               (* Equal Contribution)
 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
 
 We propose an unsupervised adaptation scheme for re-identification models where a new camera may be temporarily inserted into an existing system to get additional information.
 |  
                                                              |  | Exploiting Gaussian Mixture Importance for Person Re-identification Xiangping Zhu , 
                                                                 Amran Bhuiyan  ,
                                                                 Mohamed Lamine Mekhalfi,
                                                                 Vittorio Murino
 14th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), 2017.
 
 We propose a Gaussian Mixture Importance Estimation (GMIE) approach for ReID, which exploits the Gaussian Mixture Models (GMMs) to estimate the observed commonalities of similar and dissimilar person pairs in the feature space.
 |  
                                                              |  | Person re-identification using sparse representation with manifold constraints Behzad Mirmahboub, 
                                                                 Hamed Kiani , 
                                                                Amran Bhuiyan  ,
                                                                 Alessandro Perina ,
                                                                 Baochang Zhang ,
                                                                 Alessio Del Bue ,
                                                                 Vittorio Murino
 IEEE International Conference on Image Processing, 2016.
 
 In this paper, we propose a novel framework that combines sparse coding and manifold constraints to extract discriminative information from multi-shot images of a pedestrian for person re-identification across a set of non-overlapping surveillance cameras.
 |  
                                                                  |  | Exploiting multiple detections to learn robust brightness transfer functions in re-identification systems Amran Bhuiyan  ,                                                                   
 Alessandro Perina ,
 Vittorio Murino
 IEEE International Conference on Image Processing, 2015.
 
 This paper proposes the use of Cumulative Weighted Brightness Transfer Functions to model appearance variations. It is a multiple frame-based learning approach that leverages consecutive detections of each individual to transfer appearance, rather than learning a brightness transfer function from pairs of images.
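 For readers unfamiliar with brightness transfer functions, the Python sketch below shows the standard cumulative-histogram construction, with the histograms pooled over several detections per camera. The function names, the flat pixel-list input, and the unweighted pooling are assumptions for illustration; the weighting scheme of the proposed method is not described here, so none is reproduced.

```python
def cumulative_histogram(detections, levels=256):
    """Normalized cumulative brightness histogram pooled over detections.

    detections: list of detections, each a flat list of integer brightness
    values in [0, levels). (Hypothetical input layout.)
    """
    counts = [0] * levels
    for pixels in detections:
        for value in pixels:
            counts[value] += 1
    total = sum(counts)
    cumulative, running = [], 0
    for count in counts:
        running += count
        cumulative.append(running / total)
    return cumulative


def brightness_transfer_function(detections_a, detections_b, levels=256):
    """Map each brightness level in camera A to a level in camera B.

    Level i maps to the smallest level j whose cumulative frequency in
    camera B reaches the cumulative frequency of i in camera A.
    """
    cum_a = cumulative_histogram(detections_a, levels)
    cum_b = cumulative_histogram(detections_b, levels)
    mapping, j = [], 0
    for i in range(levels):
        # cum_a is non-decreasing, so j only ever moves forward.
        while j < levels - 1 and cum_b[j] < cum_a[i]:
            j += 1
        mapping.append(j)
    return mapping


if __name__ == "__main__":
    camera_a = [[10, 20, 30], [20, 30, 40]]   # two detections, darker camera
    camera_b = [[60, 70, 80], [70, 80, 90]]   # same person, brighter camera
    btf = brightness_transfer_function(camera_a, camera_b)
    print([btf[v] for v in (10, 20, 30, 40)])
```

 Pooling several consecutive detections gives a better-populated histogram than a single image pair, which is the motivation the summary above gives for a multiple frame-based approach.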
 |  
                                                                  |  | Person Re-identification Using Robust Brightness Transfer Functions Based on Multiple Detections Amran Bhuiyan  , 
                                                                     Behzad Mirmahboub,                                                                   
 Alessandro Perina ,
 Vittorio Murino
 International Conference on Image Analysis and Processing , 2015.
 
 This paper proposes the use of Minimum Multiple Cumulative Brightness Transfer Functions to model appearance variations. It is a multiple frame-based learning approach that leverages consecutive detections of each individual to transfer appearance, rather than learning a brightness transfer function from pairs of images.
 |  
                                                                                                                                          |  | Person re-identification by discriminatively selecting parts and features Amran Bhuiyan  ,                                                                   
                                                                         Alessandro Perina ,
                                                                        
                                                                         Vittorio Murino
 European Conference on Computer Vision (ECCV) Workshop on Visual Surveillance and Re-Identification (VS-Re-ID), 2014.
                                                                                                                                               Winner of the Intel Best Paper Award
 
 This paper presents a novel appearance-based method for person re-identification. The core idea is to rank and select different body parts on the basis of the discriminating power of their characteristic features. In our approach, we first segment the pedestrian images into meaningful parts, then we extract features from such parts as well as from the whole body and finally, we perform a salience analysis based on regression coefficients.
 |  |