JUAN CARLOS
PICHEL CAMPOS
Associate Professor
FRANCISCO MANUEL
FERNANDEZ RIVERA
Full Professor
Publications co-authored with FRANCISCO MANUEL FERNANDEZ RIVERA (22)
2016
-
Power and Energy Implications of the Number of Threads Used on the Intel Xeon Phi
Annals of Multicore and GPU Programming: AMGP, Vol. 3, No. 1, pp. 55-65
2015
-
Power and energy implications of the number of threads used on the Intel Xeon Phi
Annals of Multicore and GPU Programming: AMGP, Vol. 2, No. 1, pp. 55-65
2014
-
3DyRM: a dynamic roofline model including memory latency information
Journal of Supercomputing, Vol. 70, No. 2, pp. 696-708
-
A hardware counter-based toolkit for the analysis of memory accesses in SMPs
Concurrency and Computation: Practice and Experience, Vol. 26, No. 6, pp. 1328-1341
-
Multiobjective optimization technique based on monitoring information to increase the performance of thread migration on multicores
2014 IEEE International Conference on Cluster Computing, CLUSTER 2014
-
Using an extended Roofline Model to understand data and thread affinities on NUMA systems
Annals of Multicore and GPU Programming: AMGP, Vol. 1, No. 1, pp. 56-67
-
Using sampled information: Is it enough for the sparse matrix-vector product locality optimization?
Concurrency and Computation: Practice and Experience, Vol. 26, No. 1, pp. 98-117
2013
-
A flexible and dynamic page migration infrastructure based on hardware counters
Journal of Supercomputing, Vol. 65, No. 2, pp. 930-948
-
Sparse matrix-vector multiplication on the Single-Chip Cloud Computer many-core processor
Journal of Parallel and Distributed Computing, Vol. 73, No. 12, pp. 1539-1550
2012
-
A graphical tool for performance analysis of multicore systems based on the Roofline Model
Proceedings of the 2012 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2012
-
Experiences with the sparse matrix-vector multiplication on a many-core processor
Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
-
Hardware counters based analysis of memory accesses in SMPs
Proceedings of the 2012 10th IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2012
-
Optimization of sparse matrix-vector multiplication using reordering techniques on GPUs
Microprocessors and Microsystems, Vol. 36, No. 2, pp. 65-77
2010
-
Increasing the locality of iterative methods and its application to the simulation of semiconductor devices
International Journal of High Performance Computing Applications, Vol. 24, No. 2, pp. 136-153
-
Lessons learnt porting parallelisation techniques for irregular codes to NUMA systems
Proceedings of the 18th Euromicro Conference on Parallel, Distributed and Network-Based Processing, PDP 2010
2009
-
Increasing data reuse of sparse algebra codes on simultaneous multithreading architectures
Concurrency and Computation: Practice and Experience, Vol. 21, No. 15, pp. 1838-1856
-
On the influence of thread allocation for irregular codes in NUMA systems
Parallel and Distributed Computing, Applications and Technologies, PDCAT Proceedings
2006
-
Image segmentation based on merging of sub-optimal segmentations
Pattern Recognition Letters, Vol. 27, No. 10, pp. 1105-1116
2005
-
A new technique to reduce false sharing in parallel irregular codes based on distance functions
Proceedings of the International Symposium on Parallel Architectures, Algorithms and Networks, I-SPAN
-
Performance optimization of irregular codes based on the combination of reordering and blocking techniques
Parallel Computing, Vol. 31, No. 8-9, pp. 858-876