Course Outline

Overview of the Chinese Domestic AI GPU Ecosystem

  • Comparison of Huawei Ascend, Biren, and Cambricon MLU.
  • Analysis of CUDA versus CANN, Biren SDK, and BANGPy programming models.
  • Industry trends and vendor ecosystem dynamics.

Preparation for Migration

  • Assessing the existing CUDA codebase.
  • Identifying target platforms and necessary SDK versions.
  • Installing toolchains and setting up the development environment.
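Assessing an existing CUDA codebase usually starts with an inventory of which CUDA runtime calls appear and how often, since each one needs a platform-specific equivalent. A minimal sketch of such an inventory pass is shown below; `scan_cuda_usage` and its file-extension list are illustrative choices, not part of any vendor tooling.

```python
import re
from pathlib import Path

# Rough pattern for CUDA runtime/driver API identifiers (cudaMalloc,
# cudaMemcpy, cuMemAllocAsync, ...). A real audit would refine this.
CUDA_API_PATTERN = re.compile(r"\bcuda[A-Z]\w+|\bcu[A-Z]\w+Async\b")

def scan_cuda_usage(root: str) -> dict:
    """Count occurrences of CUDA API identifiers under `root`."""
    counts = {}
    for path in Path(root).rglob("*"):
        if path.suffix in {".cu", ".cuh", ".cpp", ".h"}:
            text = path.read_text(errors="ignore")
            for match in CUDA_API_PATTERN.findall(text):
                counts[match] = counts.get(match, 0) + 1
    return counts
```

The resulting tally helps prioritize migration work: a codebase dominated by `cudaMemcpy` and kernel launches ports differently from one that leans on cuBLAS or cuDNN, which need library-level replacements.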

Code Translation Techniques

  • Porting CUDA memory access patterns and kernel logic.
  • Mapping compute grid and thread models.
  • Evaluating automated versus manual translation approaches.
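A useful mental model when mapping the compute grid and thread hierarchy: a thread-per-element CUDA kernel is logically a loop over its global index space. The sketch below reconstructs that loop form in plain Python for a SAXPY kernel; the launch configuration (256 threads per block) is an illustrative assumption.

```python
# The CUDA kernel being modeled:
#
#   __global__ void saxpy(int n, float a, float *x, float *y) {
#       int i = blockIdx.x * blockDim.x + threadIdx.x;
#       if (i < n) y[i] = a * x[i] + y[i];
#   }

def saxpy_reference(n, a, x, y):
    """Flat-loop equivalent of the thread-per-element CUDA kernel."""
    block_dim = 256
    grid_dim = (n + block_dim - 1) // block_dim  # mirrors the launch config
    for block in range(grid_dim):
        for thread in range(block_dim):
            i = block * block_dim + thread  # blockIdx.x * blockDim.x + threadIdx.x
            if i < n:                       # bounds guard, as in the kernel
                y[i] = a * x[i] + y[i]
    return y
```

Recovering this loop form first makes the subsequent re-expression easier, since target-side programming models typically partition the same index space over tasks or cores rather than CUDA-style thread blocks.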

Platform-Specific Implementations

  • Utilizing Huawei CANN operators and writing custom kernels.
  • Navigating the Biren SDK conversion pipeline.
  • Rebuilding models using BANGPy (Cambricon).
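When rebuilding operators on a new platform, a common practice is to validate each rebuilt kernel against a trusted host-side reference within a numeric tolerance. The harness below is a minimal sketch of that comparison; `rebuilt_op` stands in for a kernel launched through CANN or BANGPy, and only the validation logic is shown.

```python
def validate_op(rebuilt_op, reference_op, inputs, rtol=1e-5, atol=1e-6):
    """Compare a rebuilt operator against a reference, elementwise.

    Returns (ok, mismatches) where mismatches lists (index, got, want)
    for every element outside the combined absolute/relative tolerance.
    """
    got = rebuilt_op(*inputs)
    want = reference_op(*inputs)
    mismatches = [
        (i, g, w)
        for i, (g, w) in enumerate(zip(got, want))
        if abs(g - w) > atol + rtol * abs(w)
    ]
    return (len(mismatches) == 0, mismatches)
```

Keeping the reference implementation on the host side also provides a platform-neutral oracle when the same model later targets Ascend, Biren, and MLU builds.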

Cross-Platform Testing and Optimization

  • Profiling execution on each target platform.
  • Conducting memory tuning and comparing parallel execution methods.
  • Monitoring performance and iterating on optimizations.
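Cross-platform comparisons benefit from a single harness that runs the same workload on every backend. The sketch below captures only wall-clock timing with a warm-up phase; the `backends` mapping of platform names to callables is a hypothetical interface, and real profiling would additionally use each vendor's own profiler tools.

```python
import time

def benchmark(backends, warmup=2, iters=10):
    """Return mean wall-clock seconds per iteration for each backend.

    `backends` maps a platform name to a zero-argument callable that
    runs one pass of the workload (e.g. a single inference).
    """
    results = {}
    for name, run in backends.items():
        for _ in range(warmup):  # discard first runs (JIT, cache warm-up)
            run()
        start = time.perf_counter()
        for _ in range(iters):
            run()
        results[name] = (time.perf_counter() - start) / iters
    return results
```

Averaging over several iterations after a warm-up phase avoids attributing one-time compilation or cache effects to a platform's steady-state performance.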

Managing Mixed GPU Environments

  • Implementing hybrid deployments across multiple architectures.
  • Developing fallback strategies and device detection mechanisms.
  • Using abstraction layers to ensure code maintainability.
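Device detection with fallback can be expressed as an ordered chain of probes, where a missing driver or SDK simply advances the chain. The sketch below assumes hypothetical probe callables; in practice each probe would attempt to import the vendor runtime and query for devices.

```python
def detect_platform(probes):
    """Return the first platform whose probe succeeds, else 'cpu'.

    `probes` is an ordered list of (name, probe_fn) pairs; a probe may
    return False or raise (e.g. ImportError when the SDK is absent).
    """
    for name, probe in probes:
        try:
            if probe():
                return name
        except Exception:
            continue  # missing driver/SDK on this host: try the next one
    return "cpu"
```

Placing this check behind an abstraction layer keeps application code free of per-vendor imports, which is what makes hybrid deployments across multiple architectures maintainable.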

Case Studies and Best Practices

  • Porting computer vision and NLP models to Ascend or Cambricon.
  • Integrating inference pipelines within Biren clusters.
  • Addressing version mismatches and API discrepancies.
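Many porting failures trace back to mismatched toolkit versions, so checking a minimum SDK version at startup yields a clearer error than a late API failure. The helper below is an illustrative sketch assuming dotted-integer version strings (e.g. "6.0.1"); real SDK version formats may need extra parsing.

```python
def version_at_least(installed: str, required: str) -> bool:
    """True if `installed` satisfies the `required` minimum version."""
    to_tuple = lambda v: tuple(int(p) for p in v.split("."))
    inst, req = to_tuple(installed), to_tuple(required)
    # Pad to equal length so "6.0" compares correctly against "6.0.1".
    width = max(len(inst), len(req))
    inst += (0,) * (width - len(inst))
    req += (0,) * (width - len(req))
    return inst >= req
```

Failing fast on a version mismatch, with the installed and required versions in the error message, turns an opaque API discrepancy into an actionable environment fix.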

Summary and Next Steps

Requirements

  • Experience in programming with CUDA or GPU-accelerated applications.
  • Understanding of GPU memory models and compute kernels.
  • Familiarity with AI model deployment or acceleration workflows.

Target Audience

  • GPU programmers
  • System architects
  • Porting specialists

Duration

  • 21 Hours
