JUAN CARLOS
PICHEL CAMPOS

Associate Professor

FRANCISCO MANUEL
FERNANDEZ RIVERA

Full Professor

Publications co-authored with FRANCISCO MANUEL FERNANDEZ RIVERA (22)

2016

Power and Energy Implications of the Number of Threads Used on the Intel Xeon Phi
Annals of Multicore and GPU Programming: AMGP, Vol. 3, No. 1, pp. 55-65

2015

Power and energy implications of the number of threads used on the Intel Xeon Phi
Annals of Multicore and GPU Programming: AMGP, Vol. 2, No. 1, pp. 55-65

2014

3DyRM: a dynamic roofline model including memory latency information
Journal of Supercomputing, Vol. 70, No. 2, pp. 696-708
A hardware counter-based toolkit for the analysis of memory accesses in SMPs
Concurrency and Computation: Practice and Experience, Vol. 26, No. 6, pp. 1328-1341
Multiobjective optimization technique based on monitoring information to increase the performance of thread migration on multicores
2014 IEEE International Conference on Cluster Computing, CLUSTER 2014
Using an extended Roofline Model to understand data and thread affinities on NUMA systems
Annals of Multicore and GPU Programming: AMGP, Vol. 1, No. 1, pp. 56-67
Using sampled information: Is it enough for the sparse matrix-vector product locality optimization?
Concurrency and Computation: Practice and Experience, Vol. 26, No. 1, pp. 98-117

2013

A flexible and dynamic page migration infrastructure based on hardware counters
Journal of Supercomputing, Vol. 65, No. 2, pp. 930-948
Sparse matrix-vector multiplication on the Single-Chip Cloud Computer many-core processor
Journal of Parallel and Distributed Computing, Vol. 73, No. 12, pp. 1539-1550

2012

A graphical tool for performance analysis of multicore systems based on the Roofline Model
Proceedings of the 2012 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2012
Experiences with the sparse matrix-vector multiplication on a many-core processor
Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
Hardware counters based analysis of memory accesses in SMPs
Proceedings of the 2012 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2012
Optimization of sparse matrix-vector multiplication using reordering techniques on GPUs
Microprocessors and Microsystems, Vol. 36, No. 2, pp. 65-77

2010

Increasing the locality of iterative methods and its application to the simulation of semiconductor devices
International Journal of High Performance Computing Applications, Vol. 24, No. 2, pp. 136-153
Lessons learnt porting parallelisation techniques for irregular codes to NUMA systems
Proceedings of the 18th Euromicro Conference on Parallel, Distributed and Network-Based Processing, PDP 2010

2009

Increasing data reuse of sparse algebra codes on simultaneous multithreading architectures
Concurrency and Computation: Practice and Experience, Vol. 21, No. 15, pp. 1838-1856
On the influence of thread allocation for irregular codes in NUMA systems
Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings

2006

Image segmentation based on merging of sub-optimal segmentations
Pattern Recognition Letters, Vol. 27, No. 10, pp. 1105-1116

2005

A new technique to reduce false sharing in parallel irregular codes based on distance functions
Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks, I-SPAN
Performance optimization of irregular codes based on the combination of reordering and blocking techniques
Parallel Computing, Vol. 31, No. 8-9, pp. 858-876