12 Credits
January - April
This subject gives students hands-on experience with the technologies, mainly from computer engineering, that make it possible to deploy data analysis tools and to develop and implement new solutions.
M05-01 Sistemas de computación para datos masivos / Computing systems for Big Data
M05-02 Herramientas en la nube para la Ciencia de Datos / Cloud for Data Science
M05-03 Desarrollo de proyectos / Project development (OpenProject, GitHub)
1. Architecture of an e-Infrastructure.
2. HPC and HTC computing: servers, clusters, supercomputers.
3. Classic management of a computer cluster. Queue systems. Benchmarking. Monitoring.
4. Systems interconnection networks.
5. Storage systems.
6. Data transmission on the Internet.
7. Distributed computing.
8. Parallel computing. Introduction to MPI.
9. Principles of management as a service: introduction to FitSM.
10. Systems Virtualization. Hypervisors.
11. Use of Containers and Docker.
12. Cloud environment: basic principles.
13. Infrastructure as a Service (IaaS), standards (OCCI), basic management with OpenStack.
14. Access to commercial resources: Amazon, Azure, BlueMix, Google Cloud.
15. Composition of Services and Platform as a Service (PaaS). Basic tools.
16. Software as a Service (SaaS). Examples of applications. Access to R and Python in SaaS mode.
17. Storage in cloud environments: the CDMI standard and the de facto standard S3. Examples of local (Ceph) and distributed (Onedata) data integration.
18. SaaS Platforms for Big Data.
19. Introduction to project methodology.
20. Case Study Design.
21. Software development. Agile Methodology.
22. Version control. GitHub.
23. Software deployment in distributed environments.
24. Global project management.
25. Application of FitSM in project development.
26. Service to third parties: SLA (Service Level Agreement) and CRM (Customer Relationship Management).